NVIDIA, the pioneer in accelerated computing and AI technology, is seeking a Senior Software Engineer specializing in Deep Learning Inference. The role sits at the intersection of AI innovation and performance optimization and centers on cutting-edge generative AI models. It involves building software that enables efficient inference on state-of-the-art models, tackling challenges across the entire stack, from server-level request batching to GPU kernel fusion. The ideal candidate will collaborate with research teams to integrate new Large Language Models (LLMs) and Vision Language Models (VLMs) into NVIDIA's open-source AI runtimes, optimize inference workloads, and build robust, scalable systems. The role requires expertise in performance optimization, a firm grasp of software engineering principles, and a deep understanding of machine learning concepts. Working at NVIDIA offers the opportunity to be at the forefront of AI innovation, collaborating with world-class teams to push the boundaries of what's possible with hardware acceleration. The company is known for its inclusive culture and commitment to diversity, and it offers a chance to work on transformative technology that impacts industries worldwide. This position is based in Tel Aviv, Israel, and requires hands-on experience with GPU programming and AI frameworks.