Normal view
- Collabnix
- Running Ollama on Kubernetes: A Complete Guide to Local LLM Deployment
Learn how to deploy and scale Ollama LLM models on Kubernetes clusters for production-ready AI applications
- Collabnix
- Building RAG Applications with Ollama and Python: Complete 2025 Tutorial
Retrieval-Augmented Generation (RAG) has revolutionized how we build intelligent applications that can access and reason over external knowledge bases. In this comprehensive tutorial, we’ll explore how to build production-ready RAG applications using Ollama and Python, leveraging the latest techniques and best practices for 2025. What is RAG and Why Use Ollama? Retrieval-Augmented Generation combines the […]
- Collabnix
- AI in Real-World Applications: Beyond Code Generation
A technical exploration of autonomous AI systems that move beyond content generation to real-world execution
- Collabnix
- Agentic AI in Customer Service: The Complete Technical Implementation Guide for 2025
Let’s get one thing straight—if you’re still deploying rule-based chatbots in 2025, you’re essentially bringing a flip phone to a smartphone convention. I’ve been in the trenches with AI implementations for years, and I can tell you that the shift from reactive customer service bots to autonomous agentic AI isn’t just evolutionary—it’s revolutionary. And frankly, […]
- Collabnix
- 10 Agentic AI Tools That Will Replace ChatGPT in 2025
Stop settling for AI that just answers questions. The future belongs to AI that actually does the work. If you’re still using ChatGPT like it’s 2023, you’re about to be left behind. While you’ve been asking ChatGPT to write emails, a revolutionary shift is happening in the AI world—and it’s called Agentic AI. Here’s the […]
- Collabnix
- Understanding the n8n App and Its Solutions
In today’s digital world, we use dozens of different apps and services every day. Email, Slack, Google Sheets, databases, social media, CRM systems – the list goes on. While each tool serves its purpose, getting them to work together smoothly can be a nightmare. Enter n8n (pronounced “n-eight-n”), a powerful workflow automation platform that connects […]
- Collabnix
- LM Studio vs Ollama: Picking the Right Tool for Local LLM Use
LM Studio prioritizes ease of use with a polished GUI ideal for beginners, while Ollama offers greater flexibility and control through its developer-friendly command-line interface and REST API. Choose LM Studio if you want a plug-and-play experience with visual controls, or Ollama if you prefer command-line power and deeper customization options. The landscape of local […]
- Docker
Publishing AI models to Docker Hub
When we first released Docker Model Runner, it came with built-in support for running AI models published and maintained by Docker on Docker Hub. This made it simple to pull a model like llama3.2 or gemma3 and start using it locally with familiar Docker-style commands.
Model Runner now supports three new commands: tag, push, and package. These enable you to share models with your team, your organization, or the wider community. Whether you’re managing your own fine-tuned models or curating a set of open-source models, Model Runner now lets you publish them to Docker Hub or any other OCI Artifact compatible Container Registry. For teams using Docker Hub, enterprise features like Registry Access Management (RAM) provide policy-based controls and guardrails to help enforce secure, consistent access.
Tagging and pushing to Docker Hub
Let’s start by republishing an existing model from Docker Hub under your own namespace.
# Step 1: Pull the model from Docker Hub
$ docker model pull ai/smollm2

# Step 2: Tag it for your own organization
$ docker model tag ai/smollm2 myorg/smollm2

# Step 3: Push it to Docker Hub
$ docker model push myorg/smollm2
That’s it! Your model is now available at myorg/smollm2 and ready to be consumed using Model Runner by anyone with access.
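For instance, a colleague with access to that namespace could pull and run the republished model straight from the CLI. Here is a minimal sketch; the namespace and prompt are illustrative:

# Pull the republished model and run a quick prompt against it
$ docker model pull myorg/smollm2
$ docker model run myorg/smollm2 "Summarize what Docker Model Runner does in one sentence."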
Pushing to other container registries
Model Runner supports other container registries beyond Docker Hub, including GitHub Container Registry (GHCR).
# Step 1: Tag for GHCR
$ docker model tag ai/smollm2 ghcr.io/myorg/smollm2

# Step 2: Push to GHCR
$ docker model push ghcr.io/myorg/smollm2
Authentication and permissions for GHCR work just like they do for regular Docker images, so you can reuse your existing workflow for managing registry credentials.
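As a quick sketch, signing in to GHCR before pushing looks the same as it does for container images; the GHCR_PAT environment variable and username below are placeholders for your own GitHub personal access token and account:

# Authenticate to GitHub Container Registry (GHCR_PAT and my-github-user are placeholders)
$ echo "$GHCR_PAT" | docker login ghcr.io -u my-github-user --password-stdin
$ docker model push ghcr.io/myorg/smollm2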
Packaging a custom GGUF file
Want to publish your own model file? You can use the package command to wrap a .gguf file into a Docker-compatible OCI artifact and push it directly to a container registry, such as Docker Hub.
# Step 1: Download a model, e.g. from HuggingFace
$ curl -L -o model.gguf https://huggingface.co/TheBloke/Mistral-7B-v0.1-GGUF/resolve/main/mistral-7b-v0.1.Q4_K_M.gguf

# Step 2: Package and push it
$ docker model package --gguf "$(pwd)/model.gguf" --push myorg/mistral-7b-v0.1:Q4_K_M
You’ve now turned a raw model file in GGUF format into a portable, versioned, and sharable artifact that works seamlessly with docker model run.
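For example, anyone with access to the repository could then pull and run the packaged model like any other Model Runner model (a short sketch; the prompt is illustrative):

# Pull the packaged GGUF model and run it with Model Runner
$ docker model pull myorg/mistral-7b-v0.1:Q4_K_M
$ docker model run myorg/mistral-7b-v0.1:Q4_K_M "Explain quantization in one sentence."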
Conclusion
We’ve seen how easy it is to publish your own models using Docker Model Runner’s new tag, push, and package commands. These additions bring the familiar Docker developer experience to the world of AI model sharing. Teams and enterprises using Docker Hub can securely manage access and control for their models, just like with container images, making it easier to scale GenAI applications across teams.
Stay tuned for more improvements to Model Runner that will make packaging and running models even more powerful and flexible.
Learn more
- Read our quickstart guide to Docker Model Runner.
- Find documentation for Model Runner.
- Subscribe to the Docker Navigator Newsletter.
- New to Docker? Create an account.
- Have questions? The Docker community is here to help.
- Collabnix
- What is Agentic AI?
So you’ve probably heard the buzz about “Agentic AI” floating around tech circles lately, right? Maybe you’re wondering if it’s just another fancy buzzword or if there’s actually something revolutionary happening here. Well, let me tell you – this isn’t just hype. We’re looking at what might be the biggest shift in how AI works […]
- Collabnix
- Agentic AI Trends 2025: The Complete Guide to Autonomous Intelligence Revolution
Discover the top agentic AI trends 2025 that will transform business operations. From multi-agent systems to enterprise deployment strategies - get expert insights now.
- Collabnix
- What is Agentic AI? A Deep Dive into MCP and the Modern Agent Ecosystem
The artificial intelligence landscape is undergoing a fundamental transformation. While traditional AI systems excel at responding to prompts and generating content, a new paradigm is emerging: Agentic AI. These systems don’t just respond—they reason, plan, and act autonomously to achieve complex objectives. At the heart of this revolution lies groundbreaking infrastructure like the Model Context […]
- Collabnix
- Ollama vs Docker Model Runner: 5 Key Reasons to Switch
Ollama vs Docker Model Runner: Key Differences Explained In recent months, the LLM deployment landscape has been evolving rapidly, with users experiencing frustration with some existing solutions. A Reddit thread titled “How to move on from Ollama?” highlights growing discontent with Ollama’s performance and reliability issues. As Docker enters this space with Model Runner, it’s […]
- Collabnix
- Securing the Model Context Protocol: A Comprehensive Guide
The Model Context Protocol (MCP) represents a significant advancement in AI capabilities, offering a universal interface that connects AI models directly to various data sources and tools. Launched by Anthropic in November 2024, MCP standardizes how applications provide context to LLMs, functioning as a “USB-C port for AI applications.” While MCP offers tremendous potential for […]
- Collabnix
- Top 10 Interesting MCP Servers You Should Know About in 2025
Model Context Protocol (MCP) servers represent a significant advancement in the world of AI and Large Language Models (LLMs). These specialized interfaces enable LLMs like Claude, ChatGPT, and others to interact with external tools, APIs, and services, dramatically extending their capabilities beyond simple text generation. Think of MCP servers as bridges that connect the reasoning […]
- Collabnix
- Running AI Agents Locally with Ollama and AutoGen
Have you ever wished you could build smart AI agents without shipping your data to third-party servers? What if I told you you can run powerful language models like Llama3 directly on your machine while building sophisticated AI agent systems? Let’s roll up our sleeves and create a self-contained AI development environment using Ollama and […]
- Collabnix
- Building AI Agents with n8n: A Complete Guide to Workflow Automation
In today’s fast-paced digital environment, automation has become essential for businesses and individuals looking to optimize their workflows. Enter n8n—an open-source, AI-native workflow automation tool that’s rapidly gaining popularity for its powerful capabilities and flexibility. What is n8n? n8n is an open-source workflow automation platform that stands out from competitors like Zapier and Make.com due […]
- Docker
Introducing Docker Model Runner: A Better Way to Build and Run GenAI Models Locally
Generative AI is transforming software development, but building and running AI models locally is still harder than it should be. Today’s developers face fragmented tooling, hardware compatibility headaches, and disconnected application development workflows, all of which hinder iteration and slow down progress.
That’s why we’re launching Docker Model Runner — a faster, simpler way to run and test AI models locally, right from your existing workflow. Whether you’re experimenting with the latest LLMs or deploying to production, Model Runner brings the performance and control you need, without the friction.
We’re also teaming up with some of the most influential names in AI and software development, including Google, Continue, Dagger, Qualcomm Technologies, HuggingFace, Spring AI, and VMware Tanzu AI Solutions, to give developers direct access to the latest models, frameworks, and tools. These partnerships aren’t just integrations; they’re a shared commitment to making AI innovation more accessible, powerful, and developer-friendly. With Docker Model Runner, you can tap into the best of the AI ecosystem from right inside your Docker workflow.
LLM development is evolving: We’re making it local-first
Local development for applications powered by LLMs is gaining momentum, and for good reason. It offers several advantages on key dimensions such as performance, cost, and data privacy. But today, local setup is complex.
Developers are often forced to manually integrate multiple tools, configure environments, and manage models separately from container workflows. Running a model varies by platform and depends on available hardware. Model storage is fragmented because there is no standard way to store, share, or serve models.
The result? Rising cloud inference costs and a disjointed developer experience. With our first release, we’re focused on reducing that friction, making local model execution simpler, faster, and easier to fit into the way developers already build.
Docker Model Runner: The simple, secure way to run AI models locally
Docker Model Runner is designed to make AI model execution as simple as running a container. With this Beta release, we’re giving developers a fast, low-friction way to run models, test them, and iterate on application code that uses models locally, without all the usual setup headaches. Here’s how:
Running models locally
With Docker Model Runner, running AI models locally is now as simple as running any other service in your inner loop. Docker Model Runner delivers this by including an inference engine as part of Docker Desktop, built on top of llama.cpp and accessible through the familiar OpenAI API. No extra tools, no extra setup, and no disconnected workflows. Everything stays in one place, so you can test and iterate quickly, right on your machine.
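As a rough sketch of that inner loop (the model name and prompt are illustrative, and the OpenAI-compatible endpoint below assumes host-side TCP access is enabled on the commonly documented default port 12434; check the Model Runner documentation for the exact setup on your machine):

# Pull a model from Docker Hub and prompt it straight from the CLI
$ docker model pull ai/smollm2
$ docker model run ai/smollm2 "Give me a one-line description of Docker."

# The same model served through the OpenAI-compatible API (assumes TCP host access on port 12434)
$ curl http://localhost:12434/engines/v1/chat/completions \
    -H "Content-Type: application/json" \
    -d '{"model": "ai/smollm2", "messages": [{"role": "user", "content": "Give me a one-line description of Docker."}]}'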
Enabling GPU acceleration (Apple silicon)
GPU acceleration on Apple silicon helps developers get fast inference and the most out of their local hardware. By using host-based execution, we avoid the performance limitations of running models inside virtual machines. This translates to faster inference, smoother testing, and better feedback loops.
Standardizing model packaging with OCI Artifacts
Model distribution today is messy. Models are often shared as loose files or behind proprietary download tools with custom authentication. With Docker Model Runner, we package models as OCI Artifacts, an open standard that allows you to distribute and version them through the same registries and workflows you already use for containers. Today, you can easily pull ready-to-use models from Docker Hub. Soon, you’ll also be able to push your own models, integrate with any container registry, connect them to your CI/CD pipelines, and use familiar tools for access control and automation.
Building momentum with a thriving GenAI ecosystem
To make local development seamless, it needs an ecosystem. That starts with meeting developers where they are, whether they’re testing model performance on their local machines or building applications that run these models.
That’s why we’re launching Docker Model Runner with a powerful ecosystem of partners on both sides of the AI application development process. On the model side, we’re collaborating with industry leaders like Google and community platforms like HuggingFace to bring you high-quality, optimized models ready for local use. These models are published as OCI artifacts, so you can pull and run them using standard Docker commands, just like any container image.
But we aren’t stopping at models. We’re also working with application, language, and tooling partners like Dagger, Continue, Spring AI, and VMware Tanzu to ensure applications built with Model Runner integrate seamlessly into real-world developer workflows. Additionally, we’re working with hardware partners like Qualcomm Technologies to ensure high-performance inference on all platforms.
As Docker Model Runner evolves, we’ll keep expanding its ecosystem of partners, broadening how models can be distributed and adding new functionality.
Where We’re Going
This is just the beginning. With Docker Model Runner, we’re making it easier for developers to bring AI model execution into everyday workflows, securely, locally, and with a low barrier to entry. Soon, you’ll be able to run models on more platforms, including Windows with GPU acceleration, customize and publish your own models, and integrate AI into your dev loop with even greater flexibility (including Compose and Testcontainers). With each Docker Desktop release, we’ll continue to unlock new capabilities that make GenAI development easier, faster, and way more fun to build with.
Try it out now!
Docker Model Runner is now available as a Beta feature in Docker Desktop 4.40. To get started:
- On a Mac with Apple silicon:
- Update to Docker Desktop 4.40
- Pull models developed by our partners at Docker’s GenAI Hub and start experimenting
- For more information, check out our documentation.
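Once Docker Desktop is updated, a quick sanity check (assuming the current Model Runner CLI subcommand names) is to confirm the runner is enabled and list the models available locally:

# Verify that Model Runner is running and see which models are available locally
$ docker model status
$ docker model list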
Try it out and let us know what you think!
How can I learn more about Docker Model Runner?
Check out our available assets today!
- Turn your Mac into an AI playground (YouTube tutorial)
- A Quickstart Guide to Docker Model Runner
- Docker Model Runner on Docker Docs
- Create Local AI Agents with Dagger and Docker Model Runner
Come meet us at Google Cloud Next!
Swing by booth 1530 in the Mandalay Bay Convention Center for hands-on demos and exclusive content.
- Collabnix
- Exploring the Llama 4 Herd and what problem does it solve?
Hold onto your hats, folks, because the world of Artificial Intelligence has just been given a significant shake-up. Meta has unveiled their latest marvels: the Llama 4 herd, marking what they’re calling “the beginning of a new era of natively multimodal AI innovation”. This isn’t just another incremental update; it’s a leap forward that promises […]
- Collabnix
- Deep Technical Analysis of Llama 4 Scout, Maverick and Behemoth
Meta’s release of the Llama 4 family represents a significant architectural leap forward in the domain of Large Language Models (LLMs). This technical deep dive explores the sophisticated architectural components, training methodologies, and performance optimizations that underpin the Llama 4 models, with particular focus on the mixture-of-experts (MoE) architecture and multimodal capabilities that define this