
Medical Image Classification 20 Times Faster on Tesla K40

Since our last post about medical image classification on the Tesla K20, we have made further progress and moved to the Tesla K40M.

Our first major hurdle was temperature control. The Tesla K40M is passively cooled, so it depends on the server's fans to stay within its operating temperature range.

We thank Ilya Goldberg of the National Institutes of Health, a co-founder of Open Microscopy, for sharing his knowledge of how computer-aided programs are applied in the medical field. Refer to https://www.openmicroscopy.org for more details.

While porting the WindCharm application to the GPU, we spent a considerable amount of time understanding the Kepler architecture in order to identify the right techniques for optimizing code on the GPU.

Understanding how the NVCC compiler allocates registers, and how that allocation affects occupancy, helps you balance register usage against the occupancy you want to achieve.
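To make the register/occupancy trade-off concrete, here is a minimal sketch of a register-limited occupancy estimate. It is not the tool we used. The hardware limits are assumptions for a Kepler GK110-class multiprocessor (compute capability 3.5), the function name `estimate_occupancy` is hypothetical, and the model deliberately ignores shared memory and warp allocation granularity, so treat its output as a rough guide only.

```python
import math

# Assumed per-multiprocessor limits for compute capability 3.5 (Kepler GK110).
WARP_SIZE = 32
REGISTERS_PER_SM = 65536
MAX_WARPS_PER_SM = 64
MAX_BLOCKS_PER_SM = 16
REGISTER_ALLOC_UNIT = 256  # assumption: registers are allocated per warp in units of 256


def estimate_occupancy(regs_per_thread, threads_per_block):
    """Estimate occupancy (0.0 to 1.0) when only registers and warp slots limit it.

    Simplified model: shared memory and warp allocation granularity are ignored.
    """
    if regs_per_thread < 1 or threads_per_block < 1:
        raise ValueError("regs_per_thread and threads_per_block must be positive")

    warps_per_block = math.ceil(threads_per_block / WARP_SIZE)

    # Round the registers needed by one warp up to the allocation unit.
    regs_per_warp = (
        math.ceil(regs_per_thread * WARP_SIZE / REGISTER_ALLOC_UNIT)
        * REGISTER_ALLOC_UNIT
    )

    # Resident blocks allowed by each limit; the smallest one wins.
    blocks_by_registers = (REGISTERS_PER_SM // regs_per_warp) // warps_per_block
    blocks_by_warps = MAX_WARPS_PER_SM // warps_per_block
    resident_blocks = min(blocks_by_registers, blocks_by_warps, MAX_BLOCKS_PER_SM)

    return resident_blocks * warps_per_block / MAX_WARPS_PER_SM


if __name__ == "__main__":
    # Show how occupancy drops as each thread uses more registers.
    for regs in (32, 40, 64, 128):
        occupancy = estimate_occupancy(regs, threads_per_block=256)
        print(f"{regs:3d} registers/thread -> estimated occupancy {occupancy:.2f}")
```

In this model, doubling the registers per thread from 32 to 64 halves the estimated occupancy for 256-thread blocks. For real kernels, the CUDA runtime function `cudaOccupancyMaxActiveBlocksPerMultiprocessor` or NVIDIA's occupancy calculator gives an authoritative figure.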

Occupancy is a key parameter in determining how well the GPU is being used. However, because occupancy can be deceptive, we studied various technical documents and webinars and ran in-house experiments to validate how much occupancy matters for delivering the best performance.

Functional units are another area to watch closely. The trade-off between accuracy and speed matters here: the choice between single- and double-precision calculations can make or break performance.
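The accuracy side of that trade-off can be illustrated with a small sketch. It emulates single precision on the CPU by rounding through the `struct` module after every addition, then compares the accumulated error against ordinary double precision. The helper names `to_float32` and `accumulate` are hypothetical, and the example says nothing about GPU throughput; it only shows why precision choices affect results.

```python
import struct


def to_float32(value):
    """Round a Python float (double precision) to the nearest single-precision value."""
    return struct.unpack("f", struct.pack("f", value))[0]


def accumulate(step, count, single_precision):
    """Add `step` to a running total `count` times, optionally in emulated float32."""
    if single_precision:
        step = to_float32(step)
    total = 0.0
    for _ in range(count):
        total += step
        if single_precision:
            # Round after every addition, as a single-precision accumulator would.
            total = to_float32(total)
    return total


COUNT = 100_000
EXACT = 10_000.0  # mathematically, 0.1 * 100_000

error_single = abs(accumulate(0.1, COUNT, single_precision=True) - EXACT)
error_double = abs(accumulate(0.1, COUNT, single_precision=False) - EXACT)

if __name__ == "__main__":
    print(f"single-precision error: {error_single:.6g}")
    print(f"double-precision error: {error_double:.6g}")
```

The single-precision total drifts much further from the exact value than the double-precision one. Whether that loss is acceptable in exchange for faster arithmetic depends on the algorithm and has to be validated against reference results.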

Returning to the Tesla K40, we have also started using texture memory. As a result, performance has improved significantly, and the application now runs in under 7 minutes, compared with 11 minutes on a Tesla K20.

The benchmarking numbers now look like this:

CPU

Dataset Used    Number of Consoles    Time
RNAi Images     1                     4 Hrs 38 Mins
RNAi Images     4                     2 Hrs 15 Mins

GPU

Dataset Used    Number of Consoles    Time             Speedup vs. CPU
RNAi Images     1                     23 Mins          12X
RNAi Images     4                     6 Mins 44 Sec    20X
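The speedup figures follow directly from the timings above. As a quick check, this short sketch converts each timing to seconds and divides the CPU time by the GPU time for the same number of consoles. The helper `to_seconds` and the dictionary names are hypothetical and exist only for this illustration.

```python
def to_seconds(hours=0, minutes=0, seconds=0):
    """Convert a duration to seconds."""
    return hours * 3600 + minutes * 60 + seconds


# Timings from the benchmark tables, keyed by number of consoles.
CPU_TIMES = {
    1: to_seconds(hours=4, minutes=38),
    4: to_seconds(hours=2, minutes=15),
}
GPU_TIMES = {
    1: to_seconds(minutes=23),
    4: to_seconds(minutes=6, seconds=44),
}

# Speedup is CPU time divided by GPU time for the same console count.
SPEEDUPS = {consoles: CPU_TIMES[consoles] / GPU_TIMES[consoles] for consoles in CPU_TIMES}

if __name__ == "__main__":
    for consoles, speedup in SPEEDUPS.items():
        print(f"{consoles} console(s): {speedup:.1f}x faster on the GPU")
```

Rounded to whole numbers, the ratios match the 12X and 20X figures reported for one and four consoles.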

I recently presented a webinar on this topic with NVIDIA. I will be sharing the slides here soon. Watch this space.


Lalit Chandivade

Lalit Chandivade works as a Technical Manager at eInfochips. He has been leading a team at eInfochips on building automated NVMe test suites and enhancing the NVMe test suites on Linux & Windows OS. Lalit has also successfully executed projects in the Linux device drivers & applications in the Storage Area Network domain.
