
Ultrasound stands as a cornerstone in medical diagnostics, celebrated for its safety, real-time capabilities, portability, and cost-effectiveness. For decades, however, the process of forming ultrasound images has relied on a hand-engineered reconstruction pipeline. This traditional approach compresses rich raw sensor measurements into a final image, often making simplifying assumptions about physics, such as a constant speed of sound throughout the body.
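To make the constant-speed-of-sound assumption concrete, here is a minimal delay-and-sum beamforming sketch. Everything in it (element count, pitch, sampling rate, the `das_scanline` helper) is illustrative, not taken from any particular scanner: the point is that the focusing delays are computed from a single assumed speed of sound `C`, so any deviation in real tissue defocuses the image.

```python
import numpy as np

# Conventional delay-and-sum (DAS) beamforming for one scan line,
# assuming a single constant speed of sound C everywhere.
# All parameters below are illustrative.
C = 1540.0          # assumed speed of sound in tissue (m/s)
FS = 40e6           # sampling rate (Hz)
N_ELEMENTS = 8      # transducer elements
N_SAMPLES = 2048    # samples per channel

pitch = 0.3e-3      # element spacing (m)
elem_x = (np.arange(N_ELEMENTS) - (N_ELEMENTS - 1) / 2) * pitch

def das_scanline(channel_data, depths, fs=FS, c=C):
    """Beamform one scan line at x=0 for the given depths (m).

    channel_data: (N_ELEMENTS, N_SAMPLES) raw RF channel data.
    Returns the coherently summed signal per depth.
    """
    out = np.zeros(len(depths))
    for i, z in enumerate(depths):
        # Two-way travel time: transmit straight down (z/c), then
        # receive along the path from (0, z) back to each element.
        rx_dist = np.sqrt(elem_x**2 + z**2)
        t = z / c + rx_dist / c
        idx = np.round(t * fs).astype(int)
        valid = idx < channel_data.shape[1]
        out[i] = channel_data[valid, idx[valid]].sum()
    return out

# Synthetic example: a point scatterer at 20 mm depth leaves a short
# echo on each channel at its expected arrival time.
z_true = 20e-3
data = np.zeros((N_ELEMENTS, N_SAMPLES))
arrival = np.round(
    (z_true / C + np.sqrt(elem_x**2 + z_true**2) / C) * FS
).astype(int)
for ch in range(N_ELEMENTS):
    data[ch, arrival[ch] - 2:arrival[ch] + 3] = 1.0

depths = np.linspace(5e-3, 40e-3, 701)
bf = das_scanline(data, depths)
print(f"estimated scatterer depth: {depths[np.argmax(bf)] * 1e3:.2f} mm")
```

When the assumed `c` matches the medium, the echoes add coherently and the scatterer appears at the right depth; when it does not, the delays are wrong and the peak smears, which is exactly the failure mode an adaptive sound-speed estimate corrects.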
In our modern era of artificial intelligence and foundation models, a critical question emerges: can we transcend the limitations of conventional beamforming? Is it possible to learn directly from raw ultrasound sensor data, leveraging information typically discarded during reconstruction, and what new diagnostic capabilities might this unlock?
NVIDIA and researchers from Siemens Healthineers joined forces to explore these questions. The collaboration has produced a new reconstruction model, which we are releasing as NV-Raw2Insights-US.
Beyond the Image: Understanding Sound with Raw2Insights
At its core, ultrasound is not merely an image; it is an interaction of sound waves with the body. What clinicians ultimately view on screen is a reconstructed picture, built from millions of tiny echoes returning from tissue. In the traditional reconstruction process, however, a significant portion of the original signal, including the nuances of how sound actually traverses different tissues, is simplified or lost entirely.
Our approach begins earlier in the data chain. Instead of processing finished images, NV-Raw2Insights-US learns directly from the raw signals captured by the ultrasound probe. This direct access provides the closest available representation of how sound interacts with the body, allowing the model to "listen" with far greater fidelity.
Learning from these raw signals enables the system to understand how each patient's unique biological makeup shapes the sound waves. Our overarching vision is end-to-end AI for ultrasound imaging, and this initial step lays the foundation of our Raw2Insights class of models. It is a fundamental shift from simply processing data to modeling the underlying physics.
In this inaugural Raw2Insights application, we focus on estimating the speed of sound for adaptive image focusing. The result is a system that generates a personalized sound-speed map for each patient in real time, allowing the scanner to correct and improve the ultrasound image on the fly, a task that once required complex, time-consuming computations.
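A small sketch shows why a patient-specific sound-speed estimate matters for focusing. The numbers and the `receive_delay_samples` helper are hypothetical (the model's actual interface is not shown here); the sketch simply compares the focusing delays a beamformer would compute under the default tissue assumption versus an adapted estimate.

```python
import numpy as np

# Effect of a patient-specific sound-speed estimate on focusing delays.
# All values are illustrative, not from NV-Raw2Insights-US.
FS = 40e6              # sampling rate (Hz)
C_NOMINAL = 1540.0     # default scanner assumption (m/s)
C_ESTIMATED = 1480.0   # hypothetical per-patient estimate (m/s)

def receive_delay_samples(elem_x, z, c, fs=FS):
    """Round-trip delay, in samples, for a focal point at depth z."""
    return np.round((z / c + np.sqrt(elem_x**2 + z**2) / c) * fs).astype(int)

elem_x = (np.arange(64) - 31.5) * 0.3e-3   # 64 elements, 0.3 mm pitch
z = 30e-3                                  # 30 mm focal depth

d_nominal = receive_delay_samples(elem_x, z, C_NOMINAL)
d_adapted = receive_delay_samples(elem_x, z, C_ESTIMATED)

# A 60 m/s mismatch shifts the focusing delays by tens of samples,
# enough to defocus the image if left uncorrected.
print("max delay mismatch:", np.abs(d_adapted - d_nominal).max(), "samples")
```

Updating the beamformer's delays with the estimated speed restores coherent summation at the true focal point, which is what "adaptive image focusing" refers to above.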
Seamless Integration: Deploying AI at the Edge
Accessing raw ultrasound channel data on clinical-grade scanners typically presents a significant challenge due to its extremely high bandwidth. To overcome this, NVIDIA developed the Holoscan Sensor Bridge (HSB), an open-source FPGA IP designed for high-bandwidth, low-latency data transfer to the GPU.
Through a technique dubbed Data over DisplayPort, an Altera Agilex-7 FPGA development kit, paired with NVIDIA HSB, streams raw ultrasound channel data directly from the DisplayPort outputs of an ACUSON Sequoia ultrasound scanner. This demonstrates how modern, high-performance compute can be integrated with existing scanner architectures.
Once captured, the NVIDIA HSB packetizes this data and transmits it over Ethernet to an NVIDIA IGX platform for both data collection and AI inference. We deploy NV-Raw2Insights-US using NVIDIA Holoscan, an edge AI sensor processing platform built for high-performance, real-time workloads on systems such as NVIDIA IGX Thor and NVIDIA DGX Spark.
With the raw data in GPU memory, NV-Raw2Insights-US runs accelerated inference on a Blackwell-class GPU, producing a patient-specific sound-speed estimate in milliseconds. The estimate is then streamed back to the ultrasound scanner, enabling dynamic, improved focusing directly within the live imaging stream. The entire loop turns raw data into actionable insight for sharper diagnostic images.
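The end-to-end loop described above can be sketched in plain Python, with stand-in functions in place of the Holoscan operators and the model. Every name here (`receive_frame`, `estimate_sound_speed`, `send_to_scanner`) is hypothetical; the sketch only conveys the shape of the streaming loop: raw channel frames in, a scalar sound-speed estimate out, estimate returned to the scanner for refocusing.

```python
import numpy as np

def receive_frame(rng):
    """Stand-in for HSB delivering one frame of raw channel data
    (elements x samples) into GPU-accessible memory."""
    return rng.standard_normal((64, 2048)).astype(np.float32)

def estimate_sound_speed(frame):
    """Stand-in for NV-Raw2Insights-US inference: raw channel data
    mapped to a sound-speed estimate (m/s). A real model runs on the
    GPU; here we just return a value in the physiological range."""
    return 1540.0 + 10.0 * float(np.tanh(frame.mean()))

def send_to_scanner(estimate):
    """Stand-in for streaming the estimate back for refocusing."""
    return {"sound_speed_m_per_s": estimate}

rng = np.random.default_rng(0)
for _ in range(3):                 # three frames of the live stream
    frame = receive_frame(rng)
    c_est = estimate_sound_speed(frame)
    reply = send_to_scanner(c_est)
    print(reply)
```

In the actual demonstration, each of these stages is a Holoscan operator and the data never leaves the GPU between them; the sketch exists only to make the dataflow explicit.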
The Future of Medical Imaging: Software-Defined Ultrasound
This demonstration architecture offers flexibility in both development and deployment. First, it allows software-only integration: NVIDIA acceleration can be added to existing medical devices with minimal modification, primarily via the Data over DisplayPort technique.
Second, it is a significant step toward software-defined ultrasound, in which capabilities improve continuously through software updates rather than hardware-centric upgrades. Finally, with raw ultrasound channel data already residing in GPU memory, the architecture supports modular expansion: new AI models can be integrated to scale diagnostic capabilities.
By shifting ultrasound intelligence from rigid, traditional algorithms to an agile, AI-driven Raw2Insights pipeline, we unlock a scalable path to truly AI-native imaging. Learning directly from raw ultrasound channel data, rather than reconstructed images, NV-Raw2Insights-US reduces errors inherent in traditional assumptions and effectively adapts imaging for each unique patient.
This architecture not only improves image clarity and diagnostic precision today, but also establishes a robust, modular foundation for the next generation of AI-powered diagnostic systems. Developers can get started by accessing the project on GitHub, along with the model weights and relevant datasets.
Source: Hugging Face Blog