
Testcontainers and Playwright

Discover how Testcontainers-Playwright simplifies browser automation and testing without local Playwright installations. Learn about its features, limitations, compatibility, and usage with code examples.

Streamlining Local Development with Dev Containers and Testcontainers Cloud

In today's world of fast-paced development, setting up a consistent and efficient development environment for all team members is crucial for overall productivity. Although Docker itself is a powerful tool that enhances developer efficiency, configuring a local development environment can still be complex and time-consuming. This is where development containers (or dev containers) come into play.


Dev containers provide an all-encompassing solution, offering everything needed to start working on a feature and running the application seamlessly. In specific terms, dev containers are Docker containers (running locally or remotely) that encapsulate everything necessary for the software development of a given project, including integrated development environments (IDEs), specific software, tools, libraries, and preconfigured services. 

This isolated environment definition can be easily transferred and launched on any computer or cloud infrastructure, allowing developers and teams to abstract away the specifics of their operating systems. The dev container settings are defined in a devcontainer.json file located within a given project, ensuring consistency across different environments.

However, development is only one part of a developer’s workflow. Another critical aspect is testing to ensure that code changes work as expected and do not introduce new issues. If you use Testcontainers for integration testing or rely on Testcontainers-based services to run your application locally, you must have Docker available from within your dev container. 

In this post, we will show how you can run Testcontainers-based tests or services from within the dev container and how to leverage Testcontainers Cloud within a dev container securely and efficiently to make interacting with Docker even easier. 

Getting started with dev containers

To get started with dev containers on your computer using this tutorial, you will need:

  • Git 2.25+
  • Docker 
  • IntelliJ IDEA

There's no need to preconfigure your project to support development containers; the IDE will do it for you. But we will need some Testcontainers usage examples to run in the dev container, so let's use the existing Java Local Development workshop repository. It contains the implementation of a Spring Boot-based microservice application for managing a catalog of products. The demo-state branch contains the implementation of Testcontainers-based integration tests and services for local development.

Although this project typically requires Java 21 and Maven installed locally, we will instead use dev containers to preconfigure all necessary tools and dependencies within the development container. 

Setting up your first dev container

To begin, clone the project:

git clone https://github.com/testcontainers/java-local-development-workshop.git
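If you also want to look at the finished Testcontainers setup referenced above before continuing, you can optionally check out the demo-state branch (the rest of the walkthrough works from whichever branch you select when creating the dev container):

cd java-local-development-workshop
git switch demo-state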

Next, open the project in your local IntelliJ IDEA and install the Dev Containers plugin (Figure 1).

Screenshot showing overview of Dev Containers plugin.
Figure 1: Install the Dev Containers plugin.

Next, we will add a .devcontainer/devcontainer.json file with the project requirements. In the context menu of the project root, select New > .devcontainer (Figure 2).

Screenshot of project menu showing selection of "New" and ".devcontainer"
Figure 2: Choosing new .devcontainer.

We’ll need Java 21, so let’s use the Java Dev Container Template. Then, select Java version 21 and enable Install Maven (Figure 3).

Screenshot of dev container configuration page showing Dev Container Template options, with Install Maven selected.
Figure 3: Dev Container Template options.

Select OK, and you’ll see a newly generated devcontainer.json file. Let’s now tweak that a bit more. 

Because Testcontainers requires access to Docker, we need to provide some access to Docker inside the dev container. Let's use an existing Development Container Feature to do this. Features enhance development capabilities within your dev container by providing self-contained units of specific container configuration, including installation steps, environment variables, and other settings.

You can add the Docker-in-Docker feature to your devcontainer.json to install Docker into the dev container itself and thus have a Docker environment available for the Testcontainers tests.

Your devcontainer.json file should now look like the following:

{
  "name": "Java Dev Container TCC Demo",
  "image": "mcr.microsoft.com/devcontainers/java:1-21-bullseye",
  "features": {
    "ghcr.io/devcontainers/features/java:1": {
      "version": "none",
      "installMaven": "true",
      "installGradle": "false"
    },
    "docker-in-docker": {
      "version": "latest",
      "moby": true,
      "dockerDashComposeVersion": "v1"
    }
  },
  "customizations": {
    "jetbrains": {
      "backend": "IntelliJ"
    }
  }
}

Now you can run the container. Navigate to devcontainer.json, click the Dev Containers plugin icon, and select Create Dev Container and Clone Sources. The New Dev Container window will open (Figure 4).

Screenshot of New Dev Container window, with options for Docker, Git Repository, and Git Branch.
Figure 4: New Dev Container window.

In the New Dev Container window, you can select the Git branch and specify where to create your dev container. By default, it uses the local Docker instance, but you can select the ellipsis (...) to add additional Docker servers from the cloud or WSL and configure the connection via SSH.

If the build process is successful, you will be able to select the desired IDE backend, which will be installed and launched within the container (Figure 5).

Screenshot of Building Dev Container window, showing container build progress and details, such as Container ID and Container name.
Figure 5: Dev container build process.

After you select Continue, a new IDE window will open, allowing you to code as usual. To view the details of the running dev container, execute docker ps in the terminal of your host (Figure 6).

Screenshot showing results of "docker ps" command, listing Container ID, Image, Status, Ports, etc.
Figure 6: Viewing dev container details.

If you run the TestApplication class, your application will start with all required dependencies managed by Testcontainers. (For more implementation details, refer to the “Local development environment with Testcontainers” step on GitHub.) You can see the services running in containers by executing docker ps in your IDE terminal (within the container). See Figure 7.

Screenshot showing more details of "docker ps" command, including services running in containers.
Figure 7: Viewing services running in containers.

Setting up Testcontainers Cloud in your dev container

To reduce the load on local resources and enhance the observability of Testcontainers-based containers, let's switch from the Docker-in-Docker feature to the Testcontainers Cloud (TCC) feature: ghcr.io/eddumelendez/test-devcontainer/tcc:0.0.2.

This feature will install and run the Testcontainers Cloud agent within the dev container, providing a remote Docker environment for your Testcontainers tests.

To enable this functionality, you’ll need to obtain a valid TC_CLOUD_TOKEN, which the Testcontainers Cloud agent will use to establish the connection. If you don’t already have a Testcontainers Cloud account, you can sign up for a free account. Once logged in, you can create a Service Account to generate the necessary token (Figure 8).

Screenshot of the Testcontainers Cloud "Create new Service Account" window, showing generation of the Access Token.
Figure 8: Generating Testcontainers Cloud access token.

To use the token value, we'll use a .env file. Create an environment file under .devcontainer/devcontainer.env and add your newly generated token value (Figure 9). Be sure to add devcontainer.env to .gitignore to prevent it from being pushed to the remote repository.

Screenshot of Project menu showing addition of token value to devcontainer.env.
Figure 9: Add your token value to devcontainer.env.
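The file itself only needs the token assignment; the value below is a placeholder for your own token:

TC_CLOUD_TOKEN=<your-testcontainers-cloud-token>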

In your devcontainer.json file, include the following options:

  • The runArgs to specify that the container should use the .env file located at .devcontainer/devcontainer.env
  • The containerEnv to set the environment variables TC_CLOUD_TOKEN and TCC_PROJECT_KEY within the container. The TC_CLOUD_TOKEN variable is dynamically set from the local environment variable.

The resulting devcontainer.json file will look like this:

{
  "name": "Java Dev Container TCC Demo",
  "image": "mcr.microsoft.com/devcontainers/java:21",
  "runArgs": [
    "--env-file",
    ".devcontainer/devcontainer.env"
  ],
  "containerEnv": {
    "TC_CLOUD_TOKEN": "${localEnv:TC_CLOUD_TOKEN}",
    "TCC_PROJECT_KEY": "java-local-development-workshop"
  },
  "features": {
    "ghcr.io/devcontainers/features/java:1": {
      "version": "none",
      "installMaven": "true",
      "installGradle": "false"
    },
    "ghcr.io/eddumelendez/test-devcontainer/tcc:0.0.2": {}
  },
  "customizations": {
    "jetbrains": {
      "backend": "IntelliJ"
    }
  }
}

Let’s rebuild and start the dev container again. Navigate to devcontainer.json, select the Dev Containers plugin, then select Create Dev Container and Clone Sources, and follow the steps as in the previous example. Once the build process is finished, choose the necessary IDE backend, which will be installed and launched within the container.

To verify that the Testcontainers Cloud agent was successfully installed in your dev container, run the following in your dev container IDE terminal: 

cat /usr/local/share/tcc-agent.log

You should see a log line similar to Listening address= if the agent started successfully (Figure 10).

Screenshot of log output, including the "Listening address" line, to verify successful installation of Testcontainers Cloud agent.
Figure 10: Verifying successful installation of Testcontainers Cloud agent.

Now you can run your tests. The ProductControllerTest class contains Testcontainers-based integration tests for our application. (For more implementation details, refer to the “Let’s write tests” step on GitHub.)

To view the containers running during the test cycle, navigate to the Testcontainers Cloud dashboard and check the latest session (Figure 11). You will see the name of the Service Account you created earlier in the Account line, and the Project name will correspond to the TCC_PROJECT_KEY defined in the containerEnv section. You can learn more about how to tag your session by project or workflow in the documentation.

Screenshot of Testcontainers Cloud dashboard listing 4 containers and a green Connect button.
Figure 11: Testcontainers Cloud dashboard.

If you want to run the application and debug containers, you can Connect to the cloud VM terminal and access the containers via the CLI (Figure 12).

Screenshot of Testcontainers Cloud showing container access via the command line.
Figure 12: Accessing containers via the CLI.

Wrapping up

In this article, we've explored the benefits of using dev containers to streamline your Testcontainers-based local development environment. Testcontainers Cloud enhances this setup further by providing a secure, scalable environment for running Testcontainers-based containers, addressing the potential security concerns and resource limitations of the Docker-in-Docker approach. This powerful combination simplifies your workflow and boosts productivity and consistency across your projects.

Running your dev containers in the cloud can further reduce the load on local resources and improve performance. Stay tuned for upcoming innovations from Docker that will enhance this capability even further.


How to Run Hugging Face Models Programmatically Using Ollama and Testcontainers

Hugging Face now hosts more than 700,000 models, with the number continuously rising. It has become the premier repository for AI/ML models, catering to both general and highly specialized needs.

As the adoption of AI/ML models accelerates, more application developers are eager to integrate them into their projects. However, the entry barrier remains high due to the complexity of setup and lack of developer-friendly tools. Imagine if deploying an AI/ML model could be as straightforward as spinning up a database. Intrigued? Keep reading to find out how.


Introduction to Ollama and Testcontainers

Recently, Ollama announced support for running models from Hugging Face. This development is exciting because it brings the rich ecosystem of AI/ML components from Hugging Face to Ollama end users, who are often developers. 

Testcontainers libraries already provide an Ollama module, making it straightforward to spin up a container with Ollama without needing to know the details of how to run Ollama using Docker:

import org.testcontainers.ollama.OllamaContainer; 

var ollama = new OllamaContainer("ollama/ollama:0.1.44"); 
ollama.start();

These lines of code are all that is needed to have Ollama running inside a Docker container effortlessly.

Running models in Ollama

By default, Ollama does not include any models, so you need to download the one you want to use. With Testcontainers, this step is straightforward thanks to the execInContainer API:

ollama.execInContainer("ollama", "pull", "moondream");

At this point, you have the moondream model ready to be used via the Ollama API. 

Excited to try it out? Hold on for a bit. This model is running in a container, so what happens if the container dies? Will you need to spin up a new container and pull the model again? Ideally not, as these models can be quite large.

Thankfully, Testcontainers makes it easy to handle this scenario by providing an easy-to-use API to commit a container image programmatically:

public void createImage(String imageName) {
    var ollama = new OllamaContainer("ollama/ollama:0.1.44");
    ollama.start();
    ollama.execInContainer("ollama", "pull", "moondream");
    ollama.commitToImage(imageName);
}

This code creates an image from the container with the model included. In subsequent runs, you can create a container from that image, and the model will already be present. Here’s the pattern:

var imageName = "tc-ollama-moondream";
var ollama = new OllamaContainer(DockerImageName.parse(imageName)
        .asCompatibleSubstituteFor("ollama/ollama:0.1.44"));
try {
    ollama.start();
} catch (ContainerFetchException ex) {
    // If the image doesn't exist, create it. Subsequent runs will reuse the image.
    createImage(imageName);
    ollama.start();
}

Now, you have a model ready to be used, and because it is running in Ollama, you can interact with its API:

var image = getImageInBase64("/whale.jpeg");
String response = given()
        .baseUri(ollama.getEndpoint())
        .header(new Header("Content-Type", "application/json"))
        .body(new CompletionRequest("moondream:latest", "Describe the image.", Collections.singletonList(image), false))
        .post("/api/generate")
        .getBody().as(CompletionResponse.class).response();

System.out.println("Response from LLM " + response);
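The CompletionRequest and CompletionResponse types used above come from the example project; conceptually, they are just thin records over the Ollama /api/generate JSON payload. A hypothetical sketch (not the project's exact code) could look like this:

// Field names mirror the Ollama /api/generate request and response payloads
record CompletionRequest(String model, String prompt, List<String> images, boolean stream) {}
record CompletionResponse(String response) {}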

Using Hugging Face models

The previous example demonstrated using a model already provided by Ollama. However, with the ability to use Hugging Face models in Ollama, your available model options have now expanded by thousands. 

To use a model from Hugging Face in Ollama, you need a GGUF file for the model. Currently, there are 20,647 models available in GGUF format. How cool is that?

The steps to run a Hugging Face model in Ollama are straightforward, but we’ve simplified the process further by scripting it into a custom OllamaHuggingFaceContainer. Note that this custom container is not part of the default library, so you can copy and paste the implementation of OllamaHuggingFaceContainer and customize it to suit your needs.

To run a Hugging Face model, do the following:

public void createImage(String imageName, String repository, String model) {
    var hfModel = new OllamaHuggingFaceContainer.HuggingFaceModel(repository, model);
    var huggingFaceContainer = new OllamaHuggingFaceContainer(hfModel);
    huggingFaceContainer.start();
    huggingFaceContainer.commitToImage(imageName);
}

By providing the repository name and the model file as shown, you can run Hugging Face models in Ollama via Testcontainers. 
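For example, reusing the GGUF file from the TinyLlama repository shown later in this post (the image name here is arbitrary):

createImage("tc-ollama-distilabelorca-tinyllama",
        "DavidAU/DistiLabelOrca-TinyLLama-1.1B-Q8_0-GGUF",
        "distilabelorca-tinyllama-1.1b.Q8_0.gguf");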

You can find an example using an embedding model and an example using a chat model on GitHub.

Customize your container

One key strength of using Testcontainers is its flexibility in customizing container setups to fit specific project needs by encapsulating complex setups into manageable containers. 

For example, you can create a custom container tailored to your requirements. Here’s an example of TinyLlama, a specialized container for spinning up the DavidAU/DistiLabelOrca-TinyLLama-1.1B-Q8_0-GGUF model from Hugging Face:

public class TinyLlama extends OllamaContainer {

    private final String imageName;

    public TinyLlama(String imageName) {
        super(DockerImageName.parse(imageName)
                .asCompatibleSubstituteFor("ollama/ollama:0.1.44"));
        this.imageName = imageName;
    }

    public void createImage(String imageName) {
        var ollama = new OllamaContainer("ollama/ollama:0.1.44");
        ollama.start();
        try {
            ollama.execInContainer("apt-get", "update");
            ollama.execInContainer("apt-get", "upgrade", "-y");
            ollama.execInContainer("apt-get", "install", "-y", "python3-pip");
            ollama.execInContainer("pip", "install", "huggingface-hub");
            ollama.execInContainer(
                    "huggingface-cli",
                    "download",
                    "DavidAU/DistiLabelOrca-TinyLLama-1.1B-Q8_0-GGUF",
                    "distilabelorca-tinyllama-1.1b.Q8_0.gguf",
                    "--local-dir",
                    "."
            );
            ollama.execInContainer(
                    "sh",
                    "-c",
                    String.format("echo '%s' > Modelfile", "FROM distilabelorca-tinyllama-1.1b.Q8_0.gguf")
            );
            ollama.execInContainer("ollama", "create", "distilabelorca-tinyllama-1.1b.Q8_0.gguf", "-f", "Modelfile");
            ollama.execInContainer("rm", "distilabelorca-tinyllama-1.1b.Q8_0.gguf");
            ollama.commitToImage(imageName);
        } catch (IOException | InterruptedException e) {
            throw new ContainerFetchException(e.getMessage());
        }
    }

    public String getModelName() {
        return "distilabelorca-tinyllama-1.1b.Q8_0.gguf";
    }

    @Override
    public void start() {
        try {
            super.start();
        } catch (ContainerFetchException ex) {
            // If image doesn't exist, create it. Subsequent runs will reuse the image.
            createImage(imageName);
            super.start();
        }
    }
}

Once defined, you can easily instantiate and utilize your custom container in your application:

var tinyLlama = new TinyLlama("example");
tinyLlama.start();
String response = given()
        .baseUri(tinyLlama.getEndpoint())
        .header(new Header("Content-Type", "application/json"))
        .body(new CompletionRequest(tinyLlama.getModelName() + ":latest", List.of(new Message("user", "What is the capital of France?")), false))
        .post("/api/chat")
        .getBody().as(ChatResponse.class).message.content;
System.out.println("Response from LLM " + response);

Note how all the implementation details are hidden behind the TinyLlama class: the end user doesn't need to know how to install the model into Ollama, what GGUF is, or that getting huggingface-cli requires pip install huggingface-hub.

Advantages of this approach

  • Programmatic access: Developers gain seamless programmatic access to the Hugging Face ecosystem.
  • Reproducible configuration: All configuration, from setup to lifecycle management, is codified, ensuring reproducibility across team members and CI environments.
  • Familiar workflows: By using containers, developers familiar with containerization can easily integrate AI/ML models, making the process more accessible.
  • Automated setups: Provides a straightforward clone-and-run experience for developers.

This approach leverages the strengths of both Hugging Face and Ollama, supported by the automation and encapsulation provided by the Testcontainers module, making powerful AI tools more accessible and manageable for developers across different ecosystems.

Conclusion

Integrating AI models into applications need not be a daunting task. By leveraging Ollama and Testcontainers, developers can seamlessly incorporate Hugging Face models into their projects with minimal effort. This approach not only simplifies setting up the development environment but also ensures reproducibility and ease of use. With the ability to programmatically manage models and containerize them for consistent environments, developers can focus on building innovative solutions without getting bogged down by complex setup procedures.

The combination of Ollama’s support for Hugging Face models and Testcontainers’ robust container management capabilities provides a powerful toolkit for modern AI development. As AI continues to evolve and expand, these tools will play a crucial role in making advanced models accessible and manageable for developers across various fields. So, dive in, experiment with different models, and unlock the potential of AI in your applications today.



Develop Kubernetes Operators in Java without Breaking a Sweat

Developing Kubernetes operators in Java is not yet the norm. So far, Go has been the language of choice here, not least because of its excellent support for writing corresponding tests. 

One challenge in developing Java-based projects has been the lack of easy automated integration testing that interacts with a Kubernetes API server. However, thanks to the open source library Kindcontainer, based on the widely used Testcontainers integration test library, this gap can be bridged, enabling easier development of Java-based Kubernetes projects. 

In this article, we’ll show how to use Testcontainers to test custom Kubernetes controllers and operators implemented in Java.


Kubernetes in Docker

Testcontainers allows starting arbitrary infrastructure components and processes running in Docker containers from tests running within a Java virtual machine (JVM). The framework takes care of binding the lifecycle and cleanup of Docker containers to the test execution. Even if the JVM is terminated abruptly during debugging, for example, it ensures that the started Docker containers are also stopped and removed. In addition to a generic class for any Docker image, Testcontainers offers specialized implementations in the form of subclasses — for components with sophisticated configuration options, for example. 
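If you have not used the base library before, a minimal example gives a feel for the API; the Redis image below is only an illustration:

// Start any Docker image from test code; Testcontainers takes care of lifecycle and cleanup
try (GenericContainer<?> redis = new GenericContainer<>(DockerImageName.parse("redis:7-alpine"))
        .withExposedPorts(6379)) {
    redis.start();
    String address = redis.getHost() + ":" + redis.getMappedPort(6379);
    // connect a client to 'address' here
}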

These specialized implementations can also be provided by third-party libraries. The open source project Kindcontainer is one such third-party library that provides specialized implementations for various Kubernetes containers based on Testcontainers:

  • ApiServerContainer
  • K3sContainer
  • KindContainer

Although ApiServerContainer focuses on providing only a small part of the Kubernetes control plane, namely the Kubernetes API server, K3sContainer and KindContainer launch complete single-node Kubernetes clusters in Docker containers. 

This allows for a trade-off depending on the requirements of the respective tests: If only interaction with the API server is necessary for testing, then the significantly faster-starting ApiServerContainer is usually sufficient. However, if testing complex interactions with other components of the Kubernetes control plane or even other operators is in scope, then the two “larger” implementations provide the necessary tools for that — albeit at the expense of startup time. For perspective, depending on the hardware configuration, startup times can reach a minute or more.

A first example

To illustrate how straightforward testing against a Kubernetes container can be, let’s look at an example using JUnit 5:

@Testcontainers
public class SomeApiServerTest {
  @Container
  public ApiServerContainer<?> K8S = new ApiServerContainer<>();

  @Test
  public void verify_no_node_is_present() {
    Config kubeconfig = Config.fromKubeconfig(K8S.getKubeconfig());
    try (KubernetesClient client = new KubernetesClientBuilder()
           .withConfig(kubeconfig).build()) {
      // Verify that ApiServerContainer has no nodes
      assertTrue(client.nodes().list().getItems().isEmpty());
    }
  }
}

Thanks to the @Testcontainers JUnit 5 extension, lifecycle management of the ApiServerContainer is easily handled by marking the container that should be managed with the @Container annotation. Once the container is started, a YAML document containing the necessary details to establish a connection with the API server can be retrieved via the getKubeconfig() method. 

This YAML document represents the standard way of presenting connection information in the Kubernetes world. The fabric8 Kubernetes client used in the example can be configured using Config.fromKubeconfig(). Any other Kubernetes client library will offer similar interfaces. Kindcontainer does not impose any specific requirements in this regard.
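For instance, the official Kubernetes Java client can consume the same kubeconfig; a sketch, assuming the io.kubernetes:client-java dependency is on the classpath:

// io.kubernetes.client.util.ClientBuilder / KubeConfig and io.kubernetes.client.openapi.apis.CoreV1Api
ApiClient apiClient = ClientBuilder.kubeconfig(
        KubeConfig.loadKubeConfig(new StringReader(K8S.getKubeconfig()))).build();
CoreV1Api coreApi = new CoreV1Api(apiClient);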

All three container implementations rely on a common API. Therefore, if it becomes clear at a later stage of development that one of the heavier implementations is necessary for a test, you can simply switch to it without any further code changes — the already implemented test code can remain unchanged.
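Concretely, upgrading the earlier example to a full single-node cluster is just a change of the container type; the getKubeconfig()-based test body stays exactly the same (a sketch):

// Before: lightweight API server only
// public ApiServerContainer<?> K8S = new ApiServerContainer<>();

// After: full single-node cluster via kind
@Container
public KindContainer<?> K8S = new KindContainer<>();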

Customizing your Testcontainers

In many situations, after the Kubernetes container has started, a lot of preparatory work needs to be done before the actual test case can begin. For an operator, for example, the API server must first be made aware of a Custom Resource Definition (CRD), or another controller must be installed via a Helm chart. What may sound complicated at first is made simple by Kindcontainer along with intuitively usable Fluent APIs for the command-line tools kubectl and helm.

The following listing shows how a CRD is first applied from the test’s classpath using kubectl, followed by the installation of a Helm chart:

@Testcontainers
public class FluentApiTest {
  @Container
  public static final K3sContainer<?> K3S = new K3sContainer<>()
    .withKubectl(kubectl -> {
      kubectl.apply.fileFromClasspath("manifests/mycrd.yaml").run();
    })
    .withHelm3(helm -> {
      helm.repo.add.run("repo", "https://repo.example.com");
      helm.repo.update.run();
      helm.install.run("release", "repo/chart");
    });
  // Tests go here
}

Kindcontainer ensures that all commands are executed before the first test starts. If there are dependencies between the commands, they can be easily resolved; Kindcontainer guarantees that they are executed in the order they are specified.

The Fluent API is translated into calls to the respective command-line tools. These are executed in separate containers, which are automatically started with the necessary connection details and connected to the Kubernetes container via the Docker internal network. This approach avoids dependencies on the Kubernetes image and version conflicts regarding the available tooling within it.

Selecting your Kubernetes version

If nothing else is specified by the developer, Kindcontainer starts the latest supported Kubernetes version by default. However, relying on this default is generally discouraged; best practice is to explicitly specify one of the supported versions when creating the container, as shown:

@Testcontainers
public class SpecificVersionTest {
  @Container
  KindContainer<?> container = new KindContainer<>(KindContainerVersion.VERSION_1_24_1);
  // Tests go here
}

Each of the three container implementations has its own Enum, through which one of the supported Kubernetes versions can be selected. The test suite of the Kindcontainer project itself ensures — with the help of an elaborate matrix-based integration test setup — that the full feature set can be easily utilized for each of these versions. This elaborate testing process is necessary because the Kubernetes ecosystem evolves rapidly, and different initialization steps need to be performed depending on the Kubernetes version.

Generally, the project places great emphasis on supporting all currently maintained Kubernetes major versions, which are released every 4 months. Older Kubernetes versions are marked as @Deprecated and eventually removed when supporting them in Kindcontainer becomes too burdensome. However, this should only happen at a time when using the respective Kubernetes version is no longer recommended.

Bring your own Docker registry

Accessing Docker images from public sources is often not straightforward, especially in corporate environments that rely on an internal Docker registry with manual or automated auditing. Kindcontainer allows developers to specify their own coordinates for the Docker images used for this purpose. However, because Kindcontainer still needs to know which Kubernetes version is being used due to potentially different initialization steps, these custom coordinates are appended to the respective Enum value:

@Testcontainers
public class CustomKubernetesImageTest {
  @Container
  KindContainer<?> container = new KindContainer<>(KindContainerVersion.VERSION_1_24_1
    .withImage("my-registry/kind:1.24.1"));
  // Tests go here
}

In addition to the Kubernetes images themselves, Kindcontainer also uses several other Docker images. As already explained, command-line tools such as kubectl and helm are executed in their own containers. Appropriately, the Docker images required for these tools are configurable as well. Fortunately, no version-dependent code paths are needed for their execution. 

Therefore, the configuration shown in the following is simpler than in the case of the Kubernetes image:

@Testcontainers
public class CustomFluentApiImageTest {
  @Container
  KindContainer<?> container = new KindContainer<>()
    .withKubectlImage(
      DockerImageName
        .parse("my-registry/kubectl:1.21.9-debian-10-r10"))
    .withHelm3Image(DockerImageName.parse("my-registry/helm:3.7.2"));
  // Tests go here
}

The coordinates of the images for all other containers started can also be easily chosen manually. However, it is always the developer’s responsibility to ensure the use of the same or at least compatible images. For this purpose, a complete list of the Docker images used and their versions can be found in the documentation of Kindcontainer on GitHub.

Admission controller webhooks

For the test scenarios shown so far, the communication direction is clear: A Kubernetes client running in the JVM accesses the locally or remotely running Kubernetes container over the network to communicate with the API server running inside it. Docker makes this standard case incredibly straightforward: A port is opened on the Docker container for the API server, making it accessible. 

Kindcontainer automatically performs the necessary configuration steps for this process and provides suitable connection information as Kubeconfig for the respective network configuration.

However, admission controller webhooks present a technically more challenging testing scenario. For these, the API server must be able to communicate with external webhooks via HTTPS when processing manifests. In our case, these webhooks typically run in the JVM where the test logic is executed. However, they may not be easily accessible from the Docker container.

To facilitate testing of these webhooks independently of the network setup, yet still make it simple, Kindcontainer employs a trick. In addition to the Kubernetes container itself, two more containers are started. An SSH server provides the ability to establish a tunnel from the test JVM into the Kubernetes container and set up reverse port forwarding, allowing the API server to communicate back to the JVM. 

Because Kubernetes requires TLS-secured communication with webhooks, an Nginx container is also started to handle TLS termination for the webhooks. Kindcontainer manages the administration of the required certificate material for this. 

The entire setup of processes, containers, and their network communication is illustrated in Figure 1.

Illustration of network setup for testing webhooks, showing JVM on the left side with Webhook server, SSH client, and JUnit test, and Docker network on right side with SSH server, Nginx container, and Kubernetes container.
Figure 1: Network setup for testing webhooks.

Fortunately, Kindcontainer hides this complexity behind an easy-to-use API:

@Testcontainers
public class WebhookTest {
    @Container
    ApiServerContainer<?> container = new ApiServerContainer<>()
        .withAdmissionController(admission -> {
            admission.mutating()
                    .withNewWebhook("mutating.example.com")
                    .atPort(webhookPort) // Local port of webhook
                    .withNewRule()
                    .withApiGroups("")
                    .withApiVersions("v1")
                    .withOperations("CREATE", "UPDATE")
                    .withResources("configmaps")
                    .withScope("Namespaced")
                    .endRule()
                    .endWebhook()
                    .build();
        });

    // Tests go here
}

The developer only needs to provide the port of the locally running webhook along with some necessary information for setting up in Kubernetes. Kindcontainer then automatically handles the configuration of SSH tunneling, TLS termination, and Kubernetes.

Consider Java

Starting from the simple example of a minimal JUnit test, we have shown how to test custom Kubernetes controllers and operators implemented in Java. We have explained how to use familiar command-line tools from the ecosystem with the help of Fluent APIs and how to easily execute integration tests even in restricted network environments. Finally, we have shown how even the technically challenging use case of testing admission controller webhooks can be implemented simply and conveniently with Kindcontainer. 

Thanks to these new testing possibilities, we hope more developers will consider Java as the language of choice for their Kubernetes-related projects in the future.


A Promising Methodology for Testing GenAI Applications in Java

In the vast universe of programming, the era of generative artificial intelligence (GenAI) has marked a turning point, opening up a plethora of possibilities for developers.

Tools such as LangChain4j and Spring AI have democratized access to the creation of GenAI applications in Java, allowing Java developers to dive into this fascinating world. With LangChain4j, for instance, setting up and interacting with large language models (LLMs) has become exceptionally straightforward. Consider the following Java code snippet:

public static void main(String[] args) {
    var llm = OpenAiChatModel.builder()
            .apiKey("demo")
            .modelName("gpt-3.5-turbo")
            .build();
    System.out.println(llm.generate("Hello, how are you?"));
}

This example illustrates how a developer can quickly instantiate an LLM within a Java application. By simply configuring the model with an API key and specifying the model name, developers can begin generating text responses immediately. This accessibility is pivotal for fostering innovation and exploration within the Java community. More than that, we have a wide range of models that can be run locally, and various vector databases for storing embeddings and performing semantic searches, among other technological marvels.

Despite this progress, however, we are faced with a persistent challenge: the difficulty of testing applications that incorporate artificial intelligence. This aspect seems to be a field where there is still much to explore and develop.

In this article, I will share a methodology that I find promising for testing GenAI applications.


Project overview

The example project focuses on an application that provides an API for interacting with two AI agents capable of answering questions. 

An AI agent is a software entity designed to perform tasks autonomously, using artificial intelligence to simulate human-like interactions and responses. 

In this project, one agent uses direct knowledge already contained within the LLM, while the other leverages internal documentation to enrich the LLM through retrieval-augmented generation (RAG). This approach allows the agents to provide precise and contextually relevant answers based on the input they receive.

I prefer to omit the technical details about RAG, as ample information is available elsewhere. I’ll simply note that this example employs a particular variant of RAG, which simplifies the traditional process of generating and storing embeddings for information retrieval.

Instead of dividing documents into chunks and making embeddings of those chunks, in this project, we will use an LLM to generate a summary of the documents. The embedding is generated based on that summary.

When the user writes a question, an embedding of the question will be generated and a semantic search will be performed against the embeddings of the summaries. If a match is found, the user’s message will be augmented with the original document.

This way, there's no need to deal with chunk configuration, decide how many chunks to retrieve, or worry about whether the way the user's message is augmented makes sense. If there is a document that talks about what the user is asking, it will be included in the message sent to the LLM.
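As a rough sketch of this flow using LangChain4j types (the prompt and metadata key are illustrative, not the project's exact code):

// Types come from dev.langchain4j.*: Document, TextSegment, Metadata, Embedding,
// ChatLanguageModel, EmbeddingModel, EmbeddingStore, EmbeddingMatch
void ingest(Document document, ChatLanguageModel llm,
            EmbeddingModel embeddingModel, EmbeddingStore<TextSegment> store) {
    // Summarize the document and embed the summary instead of raw chunks
    String summary = llm.generate("Summarize the following document:\n" + document.text());
    TextSegment segment = TextSegment.from(summary, Metadata.from("fullDocument", document.text()));
    store.add(embeddingModel.embed(segment).content(), segment);
}

String augment(String question, EmbeddingModel embeddingModel, EmbeddingStore<TextSegment> store) {
    // Semantic search of the question embedding against the summary embeddings
    Embedding questionEmbedding = embeddingModel.embed(question).content();
    List<EmbeddingMatch<TextSegment>> matches = store.findRelevant(questionEmbedding, 1, 0.8);
    if (matches.isEmpty()) {
        return question; // no matching document, send the question as-is
    }
    // Augment the user's message with the original document, not the summary
    return question + "\n\nUse the following document to answer:\n"
            + matches.get(0).embedded().metadata().get("fullDocument");
}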

Technical stack

The project is developed in Java and utilizes a Spring Boot application with Testcontainers and LangChain4j.

For setting up the project, I followed the steps outlined in Local Development Environment with Testcontainers and Spring Boot Application Testing and Development with Testcontainers.

I also use Testcontainers Desktop to facilitate database access, verify the generated embeddings, and review the container logs.

The challenge of testing

The real challenge arises when trying to test the responses generated by language models. Traditionally, we could settle for verifying that the response includes certain keywords, which is insufficient and prone to errors.

static String question = "How I can install Testcontainers Desktop?";

@Test
void verifyRaggedAgentSucceedToAnswerHowToInstallTCD() {
    String answer = restTemplate.getForObject("/chat/rag?question={question}", ChatController.ChatResponse.class, question).message();
    assertThat(answer).contains("https://testcontainers.com/desktop/");
}

This approach is not only fragile but also lacks the ability to assess the relevance or coherence of the response.

An alternative is to employ cosine similarity to compare the embeddings of a “reference” response and the actual response, providing a more semantic form of evaluation. 

This method measures the similarity between two vectors/embeddings by calculating the cosine of the angle between them. If both vectors point in the same direction, it means the “reference” response is semantically the same as the actual response.

static String question = "How I can install Testcontainers Desktop?";
static String reference = """
        - Answer must indicate to download Testcontainers Desktop from https://testcontainers.com/desktop/
        - Answer must indicate to use brew to install Testcontainers Desktop in MacOS
        - Answer must be less than 5 sentences
        """;

@Test
void verifyRaggedAgentSucceedToAnswerHowToInstallTCD() {
    String answer = restTemplate.getForObject("/chat/rag?question={question}", ChatController.ChatResponse.class, question).message();
    double cosineSimilarity = getCosineSimilarity(reference, answer);
    assertThat(cosineSimilarity).isGreaterThan(0.8);
}
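The getCosineSimilarity helper is not shown above; one way to implement it is to embed both texts with the same embedding model and reuse LangChain4j's CosineSimilarity utility (a sketch, assuming an embeddingModel field is available in the test):

// dev.langchain4j.store.embedding.CosineSimilarity and dev.langchain4j.data.embedding.Embedding
private double getCosineSimilarity(String reference, String answer) {
    Embedding referenceEmbedding = embeddingModel.embed(reference).content();
    Embedding answerEmbedding = embeddingModel.embed(answer).content();
    return CosineSimilarity.between(referenceEmbedding, answerEmbedding);
}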

However, this method introduces the problem of selecting an appropriate threshold to determine the acceptability of the response, in addition to the opacity of the evaluation process.

Toward a more effective method

The real problem here arises from the fact that answers provided by the LLM are in natural language and non-deterministic. Because of this, using current testing methods to verify them is difficult, as these methods are better suited to testing predictable values. 

However, we already have a great tool for understanding non-deterministic answers in natural language: LLMs themselves. Thus, the key may lie in using one LLM to evaluate the adequacy of responses generated by another LLM. 

This proposal involves defining detailed validation criteria and using an LLM as a "Validator Agent" to determine if the responses meet the specified requirements. This approach can be applied to validate answers to specific questions, drawing on both general knowledge and specialized information.

By incorporating detailed instructions and examples, the Validator Agent can provide accurate and justified evaluations, offering clarity on why a response is considered correct or incorrect.

static String question = "How I can install Testcontainers Desktop?";
static String reference = """
        - Answer must indicate to download Testcontainers Desktop from https://testcontainers.com/desktop/
        - Answer must indicate to use brew to install Testcontainers Desktop in MacOS
        - Answer must be less than 5 sentences
        """;

@Test
void verifyStraightAgentFailsToAnswerHowToInstallTCD() {
    String answer = restTemplate.getForObject("/chat/straight?question={question}", ChatController.ChatResponse.class, question).message();
    ValidatorAgent.ValidatorResponse validate = validatorAgent.validate(question, answer, reference);
    assertThat(validate.response()).isEqualTo("no");
}

@Test
void verifyRaggedAgentSucceedToAnswerHowToInstallTCD() {
    String answer = restTemplate.getForObject("/chat/rag?question={question}", ChatController.ChatResponse.class, question).message();
    ValidatorAgent.ValidatorResponse validate = validatorAgent.validate(question, answer, reference);
    assertThat(validate.response()).isEqualTo("yes");
}

We can even test more complex responses where the LLM should suggest a better alternative to the user’s question.

static String question = "How I can find the random port of a Testcontainer to connect to it?";
static String reference = """
        - Answer must not mention using getMappedPort() method to find the random port of a Testcontainer
        - Answer must mention that you don't need to find the random port of a Testcontainer to connect to it
        - Answer must indicate that you can use the Testcontainers Desktop app to configure fixed port
        - Answer must be less than 5 sentences
        """;

@Test
void verifyRaggedAgentSucceedToAnswerHowToDebugWithTCD() {
    String answer = restTemplate.getForObject("/chat/rag?question={question}", ChatController.ChatResponse.class, question).message();
    ValidatorAgent.ValidatorResponse validate = validatorAgent.validate(question, answer, reference);
    assertThat(validate.response()).isEqualTo("yes");
}

Validator Agent

The configuration for the Validator Agent doesn’t differ from that of other agents. It is built using the LangChain4j AI Service and a list of specific instructions:

public interface ValidatorAgent {
    @SystemMessage("""
                ### Instructions
                You are a strict validator.
                You will be provided with a question, an answer, and a reference.
                Your task is to validate whether the answer is correct for the given question, based on the reference.
                
                Follow these instructions:
                - Respond only 'yes', 'no' or 'unsure' and always include the reason for your response
                - Respond with 'yes' if the answer is correct
                - Respond with 'no' if the answer is incorrect
                - If you are unsure, simply respond with 'unsure'
                - Respond with 'no' if the answer is not clear or concise
                - Respond with 'no' if the answer is not based on the reference
                
                Your response must be a json object with the following structure:
                {
                    "response": "yes",
                    "reason": "The answer is correct because it is based on the reference provided."
                }
                
                ### Example
                Question: Is Madrid the capital of Spain?
                Answer: No, it's Barcelona.
                Reference: The capital of Spain is Madrid
                ###
                Response: {
                    "response": "no",
                    "reason": "The answer is incorrect because the reference states that the capital of Spain is Madrid."
                }
                """)
    @UserMessage("""
            ###
            Question: {{question}}
            ###
            Answer: {{answer}}
            ###
            Reference: {{reference}}
            ###
            """)
    ValidatorResponse validate(@V("question") String question, @V("answer") String answer, @V("reference") String reference);

    record ValidatorResponse(String response, String reason) {}
}

As you can see, I’m using Few-Shot Prompting to guide the LLM on the expected responses. I also request a JSON format for responses to facilitate parsing them into objects, and I specify that the reason for the answer must be included, to better understand the basis of its verdict.
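Wiring the agent follows the standard LangChain4j AI Service pattern; outside of Spring's auto-configuration, it boils down to something like this (a sketch, assuming a configured ChatLanguageModel instance):

// dev.langchain4j.service.AiServices builds a proxy that sends the prompts above to the model
ValidatorAgent validatorAgent = AiServices.create(ValidatorAgent.class, chatLanguageModel);
ValidatorAgent.ValidatorResponse verdict = validatorAgent.validate(question, answer, reference);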

Conclusion

The evolution of GenAI applications brings with it the challenge of developing testing methods that can effectively evaluate the complexity and subtlety of responses generated by advanced artificial intelligences. 

The proposal to use an LLM as a Validator Agent represents a promising approach, paving the way towards a new era of software development and evaluation in the field of artificial intelligence. Over time, we hope to see more innovations that allow us to overcome the current challenges and maximize the potential of these transformative technologies.


Shorter Feedback Loops Developing Java Apps with Digma’s Free Docker Extension

Many engineering teams follow a similar process when developing Docker images. This process encompasses activities such as development, testing, and building, and the images are then released as the subsequent version of the application. During each of these distinct stages, a significant quantity of data can be gathered, offering valuable insights into the impact of code modifications and providing a clearer understanding of the stability and maturity of the product features.

Observability has made a huge leap in recent years, and advanced technologies such as OpenTelemetry (OTEL) and eBPF have simplified the process of collecting runtime data for applications. Yet, for all that progress, developers may still be on the fence about whether and how to use this new resource. The effort required to transform the raw data into something meaningful and beneficial for the development process may seem to outweigh the advantages offered by these technologies.

The practice of continuous feedback envisions a development flow in which this particular problem has been solved. By making the evaluation of code runtime data continuous, developers can benefit from shorter feedback loops and tighter control of their codebase. Instead of waiting for issues to develop in production systems, or even surface in testing, various developer tools can watch the code observability data for you and provide early warning and rigorous linting for regressions, code smells, or issues. 

At the same time, gathering information back into the code from multiple deployment environments, such as the test or production environment, can help developers understand how a specific function, query, or event is performing in the real world at a single glance.


Meet Digma, the first linter for your runtime data 

Digma is a free developer tool that was created to bridge the continuous feedback gap in the DevOps loop. It aims to make sense of the gazillion metrics, traces, and logs the code is spewing out — so that developers won't have to. And it does this continuously and automatically.

To make the tool even more practical, Digma processes the observability data and uses it to lint the source code itself, right in the IDE. From a developer’s perspective, this means a whole new level of information about the code is revealed — allowing for better code design based on real-world feedback, as well as quick turnaround for fixing regression or avoiding common pitfalls and anti-patterns.

How Digma is deployed

Digma is packaged as a self-contained Docker extension to make it easy for developers to evaluate their code without registering with an external service or sharing any data that might not be allowed by corporate policy (Figure 1). As such, Digma acts as your own intelligent agent for monitoring code execution, especially in development and testing. The backend component is able to track execution over time and highlight any inadvertent changes to behavior, performance, or error patterns that should be addressed.

To collect data about the code, behind the scenes Digma leverages OpenTelemetry, a widely used open standard for modern observability. To make getting started easier, Digma takes care of the configuration work for you, so from the developer’s point of view, getting started with continuous feedback is equivalent to flipping a switch.

 Illustration of Digma process showing feedback providers and observability sources.
Figure 1: Overview of Digma.

Once the information is collected using OTEL, Digma runs it through what could be described as a reverse pipeline, aggregating the data, analyzing it and providing it via an API to multiple outlets including the IDE plugin, Jaeger, and the Docker extension dashboard that we’ll describe in this example.

Note: Currently, Digma supports Java applications running on IntelliJ IDEA; however, support for additional languages and platforms will be rolling out in the coming months.

Installing Digma

Prerequisites: Docker Desktop 4.8 or later.

Note: You must ensure that Docker Extensions are enabled (Figure 2).

Screenshot of Docker Desktop showing "Enable Docker Extensions" selected with blue checkmark.
Figure 2: Enabling Docker Extensions.

Step 1. Installing the Digma Docker Extension

In the Extensions Marketplace, search for Digma and select Install (Figure 3).

Screenshot of Extensions Marketplace showing results of search for Digma.
Figure 3: Install Digma.

Step 2. Collecting feedback about your code

After installing the Digma extension from the marketplace, you’ll be directed to a quickstart page that will guide you through the next steps to collect feedback about your code. Digma can collect information from two different sources:

  • Your tests/debug sessions within the IDE
  • Any Docker containers you may have running in Docker Desktop

Getting feedback while debugging/running code in the IDE

Digma's IDE extension makes the process of collecting data about your code in the IDE trivial. The entire setup is reduced to a single toggle button click (Figure 4). Behind the scenes, Digma adds variables to the runtime configuration, so that it uses OpenTelemetry for observability. No prior knowledge of observability is needed, and no code changes are necessary to get this to work.

With the observability toggle enabled, you’ll start getting immediate feedback as you run your code locally, or even when you execute your tests. Digma will be on the lookout for any regressions and will automatically spot code smells and issues.

Screenshot of Digma dialog box with Observability toggle switch enabled (in blue).
Figure 4: Observability toggle switch.

Getting feedback from running containers

In addition to the IDE, you can also choose to collect data from any running container on your machine running Java code. The process to do that is extremely simple and also documented in the Digma extension Getting Started page. It does not involve changing any Dockerfile or code. Basically, you download the OTEL agent, mount it to a volume on the container, and set some environment variables that point the Java process to use it.

curl --create-dirs -O -L --output-dir ./otel \
  https://github.com/open-telemetry/opentelemetry-java-instrumentation/releases/latest/download/opentelemetry-javaagent.jar

curl --create-dirs -O -L --output-dir ./otel \
  https://github.com/digma-ai/otel-java-instrumentation/releases/latest/download/digma-otel-agent-extension.jar

# The -O flag keeps the remote file names, so the downloaded agent jar is opentelemetry-javaagent.jar
export JAVA_TOOL_OPTIONS="-javaagent:/otel/opentelemetry-javaagent.jar -Dotel.exporter.otlp.endpoint=http://localhost:5050 -Dotel.javaagent.extensions=/otel/digma-otel-agent-extension.jar"
export OTEL_SERVICE_NAME={--ENTER YOUR SERVICE NAME HERE--}
export DEPLOYMENT_ENV=LOCAL_DOCKER

docker run -d -v "/$(pwd)/otel:/otel" --env JAVA_TOOL_OPTIONS --env OTEL_SERVICE_NAME --env DEPLOYMENT_ENV {-- APPEND PARAMS AND REPO/IMAGE --}

Uncovering how the code behaves in runtime

Whether you collect data from running containers or in the IDE, Digma will continuously analyze the code behavior and make the analysis results available both as code annotations in the IDE and as dashboards in the Digma extension itself. To demonstrate a live example, we’ve created a sample application based on the Java Spring Boot PetClinic app. 

In this scenario, we’ll clone the repo and run the code from the IDE, triggering a few actions to see what they reveal about the code. For convenience, we’ve created run configurations to simulate common API calls and create interesting data:

  1. Open the PetClinic project here in your IntelliJ IDE.
  2. Run the petclinic-service configuration.
  3. Access the API and play around with it on http://localhost:9753/ or simply run the ClientTested run config included in the workspace.

Almost immediately, the result of the dynamic linting analysis will start appearing over the code in the IDE (Figure 5):

Screenshot showing Digma Insights after analysis.
Figure 5: Results of Digma analysis.

At the same time, the Digma extension itself will present a dashboard cataloging all of the assets that have been identified, including code locations, API endpoints, database queries, and more (Figure 6). For each asset, you’ll be able to access basic statistics about its performance as well as a more detailed list of issues of concern that may need to be tracked about their runtime behavior.

 Screenshot of Digma dashboard listing assets that have been identified.
Figure 6: Digma dashboard.

Using the observability data

One of the main problems Digma tries to solve is not how to collect observability data but how to turn it into a useful and practical asset that can speed up development processes and improve the code. Digma's insights can be applied directly at design time, based on existing data, and also used to validate changes as they run in dev/test/prod, providing early feedback and shorter loops in the development process.

Examples of design time planning code insights:

  • See runtime usage for the modified functions and understand who will be affected by the change
  • Concurrency to plan for, bottlenecks and criticality
  • Existing errors and exceptions that need to be handled by the new component
  • See complete visualization of the flow of control using the modified code 

Examples of runtime code validations for shorter feedback loops:

  • Catch code and query modeling smells, such as N+1 selects, which are detected and highlighted automatically
  • Identify new performance bottlenecks and regressions as a result of the changes
  • Spot scaling issues earlier and address them as a part of the dev cycle

As Digma evolves, it continues to track and provide clear visibility into specific areas that are important to developers with the goal of having complete clarity about how each piece of code is behaving in real-world scenarios and catching issues and regressions much earlier in the process.

Who should use Digma?

Unlike many other observability solutions, Digma is code first and developer first. Digma is completely free for developers, does not require any code changes or data sharing, and will get you from zero to impactful data about your code within minutes. If you are working on Java code with a JetBrains IDE and want to improve your code with actual execution data, you can get started by picking up the Digma extension from the marketplace.

You can provide feedback in our Slack channel and tell us where Digma improved your dev cycle.

❌