DynamoDB Local is a version of Amazon DynamoDB that you can run locally as a Docker container (or other forms). It's super easy to get started: # start container docker run --rm -p 8000:8000 amazon/dynamodb-local # connect and create a table aws dynamodb create-table --endpoint-url http://localhost:8000 --table-name Books --attribute-definitions AttributeName=ISBN,AttributeType=S --key-schema AttributeName=ISBN,KeyType=HASH --billing-mode PAY_PER_REQUEST # list tables aws dynamodb list-tables --endpoint-url http://localhost:8000 More on the --endpoint-url soon. Hello Testcontainers! This is a good start. But DynamoDB Local is a great fit for Testcontainers which "is an open source framework for providing throwaway, lightweight instances of databases, message brokers, web browsers, or just about anything that can run in a Docker container." It supports multiple languages (including Go!) and databases (also messaging infrastructure, etc.); All you need is Docker. Testcontainers for Go makes it simple to programmatically create and clean up container-based dependencies for automated integration/smoke tests. You can define test dependencies as code, run tests and delete the containers once done. Testcontainers has the concept of modules that are "preconfigured implementations of various dependencies that make writing your tests even easier." Having a piece of infrastructure supported as a Testcontainer module provides a seamless, plug-and-play experience. The same applies to DynamoDB Local, where the Testcontainers module for DynamoDB Local comes in! It allows you to easily run/test your Go-based DynamoDB applications locally using Docker. Getting Started With the Testcontainers Module for DynamoDB Local Super easy! go mod init demo go get github.com/abhirockzz/dynamodb-local-testcontainers-go You can go ahead and use the sample code in the project README. To summarize, it consists of four simple steps: Start the DynamoDB Local Docker container, dynamodblocal.RunContainer(ctx). Gets the client handle for the DynamoDB (local) instance, dynamodbLocalContainer.GetDynamoDBClient(context.Background()). Uses the client handle to execute operations. In this case, create a table, add an item, and query that item. Terminate it at the end of the program (typically register it using defer), dynamodbLocalContainer.Terminate(ctx). Module Options The following configuration parameters are supported: WithTelemetryDisabled: When specified, DynamoDB local will not send any telemetry. WithSharedDB: If you use this option, DynamoDB creates a shared database file in which data is stored. This is useful if you want to persist data, e.g., between successive test executions. To use WithSharedDB, here is a common workflow: Start the container and get the client handle. Create a table, add data, and query it. Re-start container Query the same data (again); it should be there. And here is how you might go about it (error handling and logging omitted): func withSharedDB() { ctx := context.Background() //start container dynamodbLocalContainer, _ := dynamodblocal.RunContainer(ctx) defer dynamodbLocalContainer.Terminate(ctx) //get client client, _ := dynamodbLocalContainer.GetDynamoDBClient(context.Background()) //create table, add data createTable(client) value := "test_value" addDataToTable(client, value) //query same data queryResult, _ := queryItem(client, value) log.Println("queried data from dynamodb table. 
result -", queryResult) //re-start container dynamodbLocalContainer.Stop(context.Background(), aws.Duration(5*time.Second)) dynamodbLocalContainer.Start(context.Background()) //query same data client, _ = dynamodbLocalContainer.GetDynamoDBClient(context.Background()) queryResult, _ = queryItem(client, value) log.Println("queried data from dynamodb table. result -", queryResult) } To use these options together: container, err := dynamodblocal.RunContainer(ctx, WithSharedDB(), WithTelemetryDisabled()) The Testcontainers documentation is pretty good in terms of detailing how to write an extension/module. But I had to deal with a specific nuance - related to DynamoDB Local. DynamoDB Endpoint Resolution Contrary to the DynamoDB service, in order to access DynamoDB Local (with the SDK, AWS CLI, etc.), you must specify a local endpoint - http://<your_host>:<service_port>. Most commonly, this is what you would use: http://locahost:8000. The endpoint resolution process has changed since AWS SDK for Go v2 - I had to do some digging to figure it out. You can read up in the SDK documentation, but the short version is that you have to specify a custom endpoint resolver. In this case, all it takes is to retrieve the docker container host and port. Here is the implementation, this is used in the module as well. type DynamoDBLocalResolver struct { hostAndPort string } func (r *DynamoDBLocalResolver) ResolveEndpoint(ctx context.Context, params dynamodb.EndpointParameters) (endpoint smithyendpoints.Endpoint, err error) { return smithyendpoints.Endpoint{ URI: url.URL{Host: r.hostAndPort, Scheme: "http"}, }, nil } This Was Fun! As I mentioned, Testcontainers has excellent documentation, which was helpful as I had to wrap my head around how to support, the shared flag (using WithSharedDB). The solution was easy (ultimately), but the Reusable container section was the one which turned on the lightbulb for me! If you find this project interesting/helpful, don't hesitate to ⭐️ it and share it with your colleagues. Happy Building!
“Top” is a robust, lightweight command-line tool that provides real-time reports on system-wide resource utilization. It is commonly available in various Linux distributions. However, we have observed that it may not accurately report information when executed within a Docker container. This post aims to bring this issue to your attention. CPU Stress Test in Docker Container Let’s carry out a straightforward experiment. We’ll deploy a container using an Ubuntu image and intentionally increase CPU consumption. Execute the following command: Shell docker run -ti --rm --name tmp-limit --cpus="1" -m="1G" ubuntu bash -c 'apt update; apt install -y stress; stress --cpu 4' The provided command performs the following actions: Initiates a container using the Ubuntu image Establishes a CPU limit of 1 Sets a memory limit of 1G Executes the command ‘apt update; apt install -y stress; stress –cpu 4’, which conducts a CPU stress test CPU utilization reported by the top in the host Now, let’s initiate the top tool on the host where this Docker container is operating. The output of the top tool is as follows: Fig 1: top command from the host Please take note of the orange rectangle in Fig 1. This metric is indicated as 25% CPU utilization, and it is the correct value. The host has 4 cores, and we have allocated our container with a limit of 1 core. As this single core is fully utilized, the reported CPU utilization at the host level is 25% (i.e., 1/4 of the total cores). CPU Utilization Reported by the Top in the Container Now, let’s execute the top command within the container. The following is the output reported by the top command: Fig 2: top command from the container Please observe the orange rectangle in Fig 2. The CPU utilization is noted as 25%, mirroring the host’s value. This, however, is inaccurate from the container’s viewpoint as it has fully utilized its allotted CPU limit of 100%. Nevertheless, it’s important to note that the processes listed in Fig 2 are accurate. The tool correctly reports only the processes running within this container and excludes processes from the entire host. How To Find Accurate CPU Utilization in Containers In such a scenario, to obtain accurate CPU utilization within the container, there are several solutions: Docker Container Stats (docker stats) Container Advisor (cAdvisor) yCrash 1. Docker Stats The docker stats command provides fundamental resource utilization metrics at the container level. Here is the output of `docker stats` for the previously launched container: Fig 3: docker stats output Note the orange rectangle in Fig 3. The CPU utilization is indicated as 100.64%. However, the challenge lies in the fact that `docker stats` cannot be executed within the container (unless the docker socket is passed into the container, which is uncommon and poses a security risk). It must be run from the host. 2. cAdvisor You can utilize the cAdvisor (Container Advisor) tool, which inherently supports Docker containers, to furnish container-level resource utilization metrics. 3. yCrash Fig 4: yCrash – root cause analysis report Additionally, you have the option to employ the yCrash tool, which not only provides container-level metrics but also analyzes application-level dumps (such as Garbage Collection logs, application logs, threads, memory dumps, etc.) and presents a comprehensive root cause analysis report. Conclusion While “top” serves as a reliable tool for monitoring system-wide resource utilization, its accuracy within Docker containers may be compromised. 
This discrepancy can lead to misleading insights into container performance, especially regarding CPU utilization. As demonstrated in our experiment, “top” reported 25% CPU usage within the container despite full utilization of the allocated CPU limit. To obtain precise metrics within Docker containers, alternative tools such as Docker Container Stats, cAdvisor, and yCrash offer valuable insights into resource utilization. By leveraging these tools, users can ensure accurate monitoring and optimization of containerized environments, ultimately enhancing performance and operational efficiency.
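To make the first two options above concrete, here is how they might be invoked for the container from the earlier experiment. The container name tmp-limit comes from the docker run command used above; the cAdvisor invocation is a common minimal form whose flags may vary slightly between cAdvisor versions.

Shell
# One-shot resource snapshot for the stress-test container (run on the host)
docker stats tmp-limit --no-stream

# Launch cAdvisor on the host, then browse http://localhost:8080 for per-container metrics
docker run -d --name=cadvisor -p 8080:8080 \
  -v /:/rootfs:ro \
  -v /var/run:/var/run:ro \
  -v /sys:/sys:ro \
  -v /var/lib/docker/:/var/lib/docker:ro \
  gcr.io/cadvisor/cadvisor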
In the ever-evolving landscape of cloud-native computing, containers have emerged as the linchpin, enabling organizations to build, deploy, and scale applications with unprecedented agility. However, as the adoption of containers accelerates, so does the imperative for robust container security strategies. The interconnected realms of containers and the cloud have given rise to innovative security patterns designed to address the unique challenges posed by dynamic, distributed environments. This article explores the patterns, anti-patterns, and practices that are steering the course in an era of cloud-native architecture, from the orchestration intricacies of Kubernetes across Amazon Elastic Kubernetes Service (EKS), Azure Kubernetes Service (AKS), and Google Kubernetes Engine (GKE) to the nuances of securing microservices. What Is Container Security? Container security is the practice of ensuring that container environments are protected against threats. As with any security implementation within the software development lifecycle (SDLC), the practice of securing containers is a crucial step to take, as it not only protects against malicious actors but also allows containers to run smoothly in production. Learn how to incorporate CI/CD pipelines into your SDLC. The process of securing containers is a continuous one and can be implemented at the infrastructure level, at runtime, and across the software supply chain, to name a few areas. As such, securing containers is not a one-size-fits-all approach. In the following sections, we will discuss different container management strategies and how security comes into play. Review additional CI/CD design patterns. How to Build a Container Strategy With Security Forensics Embedded A container management strategy involves a structured plan to oversee the creation, deployment, orchestration, maintenance, and discarding of containers and containerized applications. It encompasses key elements to ensure efficiency, security, and scalability throughout a software development lifecycle based around containerization. Let's first analyze the prevailing and emerging anti-patterns for container management and security. Then, we will correlate possible solutions or alternative recommendations to each anti-pattern, along with optimization practices for fortifying container security strategies against today's and tomorrow's threats. Review more DevOps anti-pattern examples. "Don't treat container security like a choose-your-own-adventure book; following every path might lead to a comedy of errors, not a happy ending!" Container Security Best Practices Weak Container Supply Chain Management This anti-pattern overlooks the container supply chain management visible in "docker history," risking compromised security. Hastily using unofficial Docker images without vetting their origin or build process poses a significant threat. Ensuring robust container supply chain management is vital for upholding integrity and security within the container environment. Learn how to perform a docker container health check. Anti-Pattern: Potential Compromise Pushing malicious code into Docker images is straightforward, but detecting such code is challenging. Blindly using others' images, or building new ones from them, can risk security, even if they solve similar problems. Pattern: Secure Practices Instead of relying solely on others' images, inspect their Dockerfiles, emulate their approach, and customize them for your needs.
Ensure FROM lines in the Dockerfile point to trusted images, preferably official ones or those you've crafted from scratch, despite the added effort, ensuring security over potential breach aftermaths. Installing Non-Essential Executables Into a Container Image Non-essential executables for container images encompass anything unnecessary for the container's core function or app interpreter. For production, omit tools like text editors. Java or Python apps may need specific executables, while Go apps can run directly from a minimal "scratch" base image. Anti-Pattern: Excessive Size Adding non-essential executables to a container amplifies vulnerability risks and enlarges image size. This surplus bulk slows pull times and increases network data transmission. Pattern: Trim the Fat Start with a minimal official or self-generated base image to curb potential threats. Assess your app's true executable necessities, avoiding unnecessary installations. Exercise caution while removing language-dependent executables to craft a lean, cost-effective container image. Cloning an Entire Git Repo Into a Container Image It could look something like : GitHub Flavored Markdown RUN git clone https://github.org/somerepo Anti-Pattern: Unnecessary Complexity External dependency: Relying on non-local sources for Docker image files introduces risk, as these files may not be vetted beforehand. Git clutter: A git clone brings surplus files like the .git/ directory, increasing image size. The .git/ folder may contain sensitive information, and removing it is error-prone. Network dependency: Depending on container engine networking to fetch remote files adds complexity, especially with corporate proxies, potentially causing build errors. Executable overhead: Including the Git executable in the image is unnecessary unless directly manipulating Git repositories. Pattern: Streamlined Assembly Instead of a direct git clone in the Dockerfile, clone to a sub-directory in the build context via a shell script. Then, selectively add needed files using the COPY directive, minimizing unnecessary components. Utilize a .dockerignore file to exclude undesired files from the Docker image. Exception: Multi-Stage Build For a multi-stage build, consider cloning the repository to a local folder and then copying it to the build-stage container. While git clone might be acceptable, this approach offers a more controlled and error-resistant alternative. Building a Docker Container Image “On the Fly” Anti-Pattern: Skipping Registry Deployment Performing cloning, building, and running a Docker image without pushing it to an intermediary registry is an anti-pattern. This skips security screenings, lacks a backup, and introduces untested images to deployment. The main reason is that there are security and testing gaps: Backup and rollback: Skipping registry upload denies the benefits of having a backup, which is crucial for quick rollbacks in case of deployment failures. Vulnerability scanning: Neglecting registry uploads means missing out on vulnerability scanning, a key element in ensuring data and user safety. Untested images: Deploying unpushed images means deploying untested ones, a risky practice, particularly in a production environment. DZone's previously covered how to use penetration tests within an organization. Pattern: Registry Best Practices Build and uniquely version images in a dedicated environment, pushing them to a container registry. Let the registry scan for vulnerabilities and ensure thorough testing before deployment. 
Utilize deployment automation for seamless image retrieval and execution. Running as Root in the Container Anti-Pattern: Defaulting to Root User Many new container users inadvertently run containers with root as the default user, a practice necessitated by container engines during image creation. This can lead to the following security risks: Root user vulnerabilities: Running a Linux-based container as root exposes the system to potential takeovers and breaches, allowing bad actors access inside the network and potentially the container host system. Container breakout risk: A compromised container could lead to a breakout, granting unauthorized root access to the container host system. Pattern: User Privilege Management Instead of defaulting to root, use the USER directive in the Dockerfile to specify a non-root user. Prior to this, ensure the user is created in the image and possesses adequate permissions for required commands, including running the application. This practice reduces security vulnerabilities associated with root user privileges. Running Multiple Services in One Container Anti-Pattern: Co-Locating Multiple Tiers This anti-pattern involves running multiple tiers of an application, such as APIs and databases, within the same container, contradicting the minimalist essence of container design. The complexity and deviation from the design cause the following challenges: Minimalism violation: Containers are meant to be minimalistic instances, focusing on the essentials for running a specific application tier. Co-locating services in a single container introduces unnecessary complexity. Exit code management: Containers are designed to exit when the primary executable ends, relaying the exit code to the launching shell. Running multiple services in one container requires manual management of unexpected exceptions and errors, deviating from container engine handling. Pattern: Service Isolation Adopt the principle of one container per task, ensuring each container hosts a single service. Establish a local virtualized container network (e.g., docker network create) for intra-container communication, enabling seamless interaction without compromising the minimalist design of individual containers. Embedding Secrets in an Image Anti-Pattern: Storing Secrets in Container Images This anti-pattern involves storing sensitive information, such as local development secrets, within container images, often overlooked in various parts like ENV directives in Dockerfiles. This causes the following security compromises: Easy to forget: Numerous locations within container images, like ENV directives, provide hiding spots for storing information, leading to inadvertent negligence and forgetfulness. Accidental copy of secrets: Inadequate precautions might result in copying local files containing secrets, such as .env files, into the container image. Pattern: Secure Retrieval at Runtime Dockerignore best practices: Implement a .dockerignore file encompassing local files housing development secrets to prevent inadvertent inclusion in the container image. This file should also be part of .gitignore. Dockerfile security practices: Avoid placing secrets in Dockerfiles. For secure handling during build or testing phases, explore secure alternatives to passing secrets via --build-arg, leveraging Docker's BuildKit for enhanced security. 
Runtime secret retrieval: Retrieve secrets at runtime from secure stores like HashiCorp Vault, cloud-based services (e.g., AWS KMS), or Docker's built-in secrets functionality, which requires a docker-swarm setup for utilization. Failing to Update Packages When Building Images Anti-Pattern: Static Base Image Packages This anti-pattern stems from a former best practice where container image providers discouraged updating packages within base images. However, the current best practice emphasizes updating installed packages every time a new image is built. The main reason for this is outdated packages, which causes lagging updates. Base images may not always contain the latest versions of installed packages due to periodic or scheduled image builds, leaving systems vulnerable to outdated packages, including security vulnerabilities. Pattern: Continuous Package Updates To address this, regularly update installed packages using the distribution's package manager within the Dockerfile. Incorporate this process early in the build, potentially within the initial RUN directive, ensuring that each new image build includes updated packages for enhanced security and stability. When striving to devise a foolproof solution, a frequent misstep is to undervalue the resourcefulness of total novices. Building Container Security Into Development Pipelines Creates a Dynamic Landscape In navigating the ever-evolving realm of containers, which are at an all-time high in popularity and directly proportional to the quantum of security threats, we've delved into a spectrum of crucial patterns and anti-patterns. From fortifying container images by mastering the intricacies of supply chain management to embracing the necessity of runtime secrets retrieval, each pattern serves as a cornerstone in the architecture of robust container security. Unraveling the complexities of co-locating services and avoiding the pitfalls of outdated packages, we've highlighted the significance of adaptability and continuous improvement. As we champion the ethos of one-container-per-task and the secure retrieval of secrets, we acknowledge that container security is not a static destination but an ongoing journey. By comprehending and implementing these patterns, we fortify our containers against potential breaches, ensuring a resilient and proactive defense in an ever-shifting digital landscape.
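To tie several of the patterns above together, here is a small illustrative Dockerfile sketch: a trusted, minimal base image, packages updated at build time, a non-root user, and secrets kept out of image layers. The base image, package manager, user name, and file names are examples rather than a prescription, and the BuildKit secret mount assumes builds run with BuildKit enabled.

Dockerfile
# syntax=docker/dockerfile:1
# Trusted, minimal official base image (pin a specific tag or digest in practice)
FROM eclipse-temurin:21-jre

# Update installed packages early in the build (continuous package updates pattern)
RUN apt-get update && apt-get upgrade -y && rm -rf /var/lib/apt/lists/*

# Create and switch to a non-root user (avoids the running-as-root anti-pattern)
RUN useradd --create-home appuser
USER appuser

# Copy only what is needed from the build context; a .dockerignore keeps .git/ and local secrets out
COPY --chown=appuser app.jar /home/appuser/app.jar

# Example of a BuildKit secret that is available only during this build step and never baked into a layer
# RUN --mount=type=secret,id=build_token ./fetch-private-dependency.sh

ENTRYPOINT ["java", "-jar", "/home/appuser/app.jar"]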
Docker has transformed the world of containerization by providing a powerful platform for packaging, shipping, and running applications within containers. A key aspect of containerization is networking, and Docker offers a range of networking drivers to facilitate communication between containers and with external networks. Containers provide a lightweight, portable, and isolated environment for applications, which makes them appealing to developers and DevOps teams, and networking is critical for allowing containers to communicate with one another and with external systems. In this guide, we will explore the significance of networking drivers in Docker, how they work, the different types available, and best practices for selecting the right driver to optimize container networking. The Role of Networking Drivers Networking drivers in Docker are essential components responsible for configuring the network interface of containers and connecting them to different network segments. They play a critical role in enabling communication among containers, connecting containers to external networks, and ensuring network isolation and security. The primary functions of networking drivers include: Creating Isolated Networks: Networking drivers can create isolated networks within the Docker host, enabling containers to communicate securely without interfering with one another. Bridge and Routing: They provide the bridge and routing functionality necessary to connect containers to the host network or other external networks. Custom Network Topologies: Docker networking drivers allow users to create custom network topologies, connecting containers in various ways to achieve specific communication patterns. Integration with External Networks: Networking drivers enable Docker containers to communicate with external networks, such as the Internet or on-premises networks. How Networking Drivers Work Networking drivers in Docker operate by configuring network interfaces and rules on the host system to manage the network connectivity of containers. They allow containers to connect to virtual or physical network interfaces and interact with other containers or external systems. Here’s a simplified overview of how networking drivers work: Isolation: Docker creates isolated networks for containers, ensuring that each container operates in its dedicated network namespace, preventing direct interference between containers. Routing: Networking drivers set up routing tables and firewall rules to enable containers to communicate within their respective networks and with external systems.
Bridge and Overlay Networks: Networking drivers manage bridge and overlay networks that facilitate communication between containers. Bridge networks are used for communication within the host, while overlay networks allow containers to communicate across hosts. Custom Configuration: Depending on the networking driver chosen, custom configurations like IP addressing, port mapping, and network discovery can be implemented to meet specific communication requirements. Common Docker Networking Drivers Docker offers a variety of networking drivers, each with its own strengths and use cases. The choice of a networking driver can significantly impact container communication, performance, and network security. Here are some of the most commonly used Docker networking drivers: Bridge Bridge is the default Docker networking driver and is commonly used for local communication between containers on a single host. Containers connected to a bridge network can communicate with each other over the host’s internal network. The bridge network provides NAT (Network Address Translation) for container-to-host communication and basic isolation. Pros Simple to set up and use. Suitable for scenarios where containers need to communicate with each other on the same host. Provides basic network isolation. Cons Limited to communication within the host. Not ideal for multi-host communication. Host The Host network driver allows containers to share the host’s network namespace. This means that containers have full access to the host’s network stack and can communicate with external networks directly using the host’s IP address. It’s primarily used when you need maximum network performance and don’t require network isolation. Pros Highest possible network performance. Containers share the host’s network namespace, enabling access to external networks directly. Cons Minimal network isolation. Containers may conflict with ports already in use on the host. Overlay The Overlay network driver enables communication between containers running on different Docker hosts. It creates a distributed network that spans multiple hosts, making it suitable for building multi-host and multi-container applications. Overlay networks are based on the VXLAN protocol, providing encapsulation and tunneling for inter-host communication. Pros Supports communication between containers on different hosts. Scalable for multi-host environments. Provides network isolation and segmentation. Cons Requires more configuration than bridge networks. Requires network plugins for integration with third-party networking technologies. Macvlan Macvlan allows you to assign a MAC address to each container, making them appear as separate physical devices on the network. This is useful when you need containers to communicate with external networks using unique MAC and IP addresses. Macvlan is typically used in scenarios where containers need to behave like physical devices on the network. Pros Containers appear as distinct devices on the network. Useful for scenarios where containers require unique MAC addresses. Supports direct external network communication. Cons Requires careful configuration to avoid conflicts with existing network devices. Limited to Linux hosts. Ipvlan Ipvlan is a similar network driver to Macvlan but provides separate IP addresses to containers while sharing the same MAC address. Ipvlan is efficient for scenarios where multiple containers need to share a network link while having individual IP addresses. 
Pros Provides separate IP addresses to containers. More efficient resource usage compared to Macvlan. Supports external network communication. Cons Limited to Linux hosts. Containers share the same MAC address, which may have limitations in specific network configurations. Selecting the Right Networking Driver Choosing the right networking driver for your Docker environment is a critical decision that depends on your specific use case and requirements. Consider the following factors when making your selection: Container Communication Needs: Determine whether your containers need to communicate locally within the same host, across multiple hosts, or directly with external networks. Network Isolation: Consider the level of network isolation required for your application. Some drivers, like Bridge and Overlay, provide network segmentation and isolation, while others, like Host and Macvlan, offer less isolation. Host OS Compatibility: Ensure that the chosen networking driver is compatible with your host operating system. Some drivers are limited to Linux hosts, while others can be used in a broader range of environments. Performance and Scalability: Assess the performance characteristics of the networking driver in your specific environment. Different drivers excel in various workloads, so it’s essential to align performance with your application’s needs. Configuration Complexity: Evaluate the complexity of setting up and configuring the networking driver. Some drivers require more extensive configuration than others. Best Practices for Docker Networking Selecting the right networking driver is just the first step in optimizing Docker container communication. To ensure optimal performance, security, and network isolation, consider these best practices: Performance Considerations Monitor Network Traffic: Regularly monitor network traffic and bandwidth usage to identify bottlenecks and performance issues. Tools like iftop and netstat can help in this regard. Optimize DNS Resolution: Configure DNS resolution efficiently to reduce network latency and improve container name resolution. Use Overlay Networks for Multi-Host Communication: When building multi-host applications, use Overlay networks for efficient and secure communication between containers on different hosts. Security and Isolation Implement Network Segmentation: Use Bridge or Overlay networks for network segmentation and isolation between containers to prevent unauthorized communication. Network Policies and Firewall Rules: Define network policies and firewall rules to control container communication and enforce security measures. Regular Updates and Security Patches: Keep your Docker installation, host OS, and networking drivers up to date with the latest security patches and updates to mitigate vulnerabilities. TLS Encryption: Enable TLS (Transport Layer Security) encryption for container communication when transmitting sensitive data. Container Privileges: Limit container privileges and define user namespaces to restrict container access to the host and network resources. Conclusion Docker networking drivers are required for containers to communicate with external networks. They are critical in the creation of isolated networks, the routing of communication, and the creation of specialized network topologies. It is critical to select the correct networking driver for your Docker system to provide optimal container connectivity, performance, security, and network isolation. 
You can leverage the full power of Docker containers and optimize communication for your applications by knowing the strengths and limits of common Docker networking drivers and following recommended practices. Whether you’re developing single-host or multi-host applications, the networking driver you choose will be critical to the success of your containerized system.
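As a quick reference for the drivers discussed above, the commands below show how each network type might be created with the Docker CLI. The subnets, interface name, and image/container names are illustrative, and the overlay example assumes the host participates in a Docker Swarm.

Shell
# Bridge: local, single-host network with basic isolation
docker network create --driver bridge app-net
docker run -d --network app-net --name api my-api-image

# Host: share the host's network namespace (maximum performance, no isolation)
docker run -d --network host my-api-image

# Overlay: multi-host network (requires swarm mode, e.g., docker swarm init)
docker network create --driver overlay --attachable multi-host-net

# Macvlan: containers appear as distinct devices on the physical network
docker network create --driver macvlan \
  --subnet=192.168.1.0/24 --gateway=192.168.1.1 \
  -o parent=eth0 macvlan-net

# Ipvlan: containers get separate IP addresses while sharing the parent interface
docker network create --driver ipvlan \
  --subnet=192.168.2.0/24 \
  -o parent=eth0 ipvlan-net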
I never moved away from Docker Desktop. For some time, after you use it to build an image, it prints a message: Plain Text What's Next? View a summary of image vulnerabilities and recommendations → docker scout quickview I decided to give it a try. I'll use the root commit of my OpenTelemetry tracing demo. Let's execute the proposed command: Shell docker scout quickview otel-catalog:1.0 Here's the result: Plain Text ✓ Image stored for indexing ✓ Indexed 272 packages Target │ otel-catalog:1.0 │ 0C 2H 15M 23L digest │ 7adfce68062e │ Base image │ eclipse-temurin:21-jre │ 0C 0H 15M 23L Refreshed base image │ eclipse-temurin:21-jre │ 0C 0H 15M 23L │ │ What's Next? View vulnerabilities → docker scout cves otel-catalog:1.0 View base image update recommendations → docker scout recommendations otel-catalog:1.0 Include policy results in your quickview by supplying an organization → docker scout quickview otel-catalog:1.0 --org <organization> Docker gives out exciting bits of information: The base image contains 15 middle-severity vulnerabilities and 23 low-severity ones The final image has an additional two high-level severity Ergo, our code introduced them! Following Scout's suggestion, we can drill down the CVEs: Shell docker scout cves otel-catalog:1.0 This is the result: Plain Text ✓ SBOM of image already cached, 272 packages indexed ✗ Detected 18 vulnerable packages with a total of 39 vulnerabilities ## Overview │ Analyzed Image ────────────────────┼────────────────────────────── Target │ otel-catalog:1.0 digest │ 7adfce68062e platform │ linux/arm64 vulnerabilities │ 0C 2H 15M 23L size │ 160 MB packages │ 272 ## Packages and Vulnerabilities 0C 1H 0M 0L org.yaml/snakeyaml 1.33 pkg:maven/org.yaml/snakeyaml@1.33 ✗ HIGH CVE-2022-1471 [Improper Input Validation] https://scout.docker.com/v/CVE-2022-1471 Affected range : <=1.33 Fixed version : 2.0 CVSS Score : 8.3 CVSS Vector : CVSS:3.1/AV:N/AC:L/PR:L/UI:N/S:U/C:H/I:H/A:L 0C 1H 0M 0L io.netty/netty-handler 4.1.100.Final pkg:maven/io.netty/netty-handler@4.1.100.Final ✗ HIGH CVE-2023-4586 [OWASP Top Ten 2017 Category A9 - Using Components with Known Vulnerabilities] https://scout.docker.com/v/CVE-2023-4586 Affected range : >=4.1.0 : <5.0.0 Fixed version : not fixed CVSS Score : 7.4 CVSS Vector : CVSS:3.1/AV:N/AC:H/PR:N/UI:N/S:U/C:H/I:H/A:N The original output is much longer, but I stopped at the exciting bit: the two high-severity CVEs; first, we see the one coming from Netty still needs to be fixed — tough luck. However, Snake YAML fixed its CVE from version 2.0 onward. I'm not using Snake YAML directly; it's a Spring dependency brought by Spring. Because of this, no guarantee exists that a major version upgrade will be compatible. But we can surely try. Let's bump the dependency to the latest version: XML <dependency> <groupId>org.yaml</groupId> <artifactId>snakeyaml</artifactId> <version>2.2</version> </dependency> We can build the image again and check that it still works. Fortunately, it does. We can execute the process again: Shell docker scout quickview otel-catalog:1.0 Lo and behold, the high-severity CVE is no more! Plain Text ✓ Image stored for indexing ✓ Indexed 273 packages Target │ local://otel-catalog:1.0-1 │ 0C 1H 15M 23L digest │ 9ddc31cdd304 │ Base image │ eclipse-temurin:21-jre │ 0C 0H 15M 23L Conclusion In this short post, we tried Docker Scout, the Docker image vulnerability detection tool. Thanks to it, we removed one high-level CVE we introduced in the code. 
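A side note on the dependency bump above: if the project inherits from spring-boot-starter-parent, an alternative to declaring the dependency explicitly is to override the version property that Spring Boot's dependency management uses for SnakeYAML, so every transitive usage picks up the patched version. This is a sketch; verify the property name against the Spring Boot version in use.

XML
<properties>
    <!-- Override the SnakeYAML version managed by Spring Boot's parent POM -->
    <snakeyaml.version>2.2</snakeyaml.version>
</properties>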
To Go Further: Docker Scout; 4 Free, Easy-To-Use Tools For Docker Vulnerability Scanning
The battle for container orchestration began in the early 2010s with the rise of containerization technology. Containerization allows developers to package their applications into small, portable, self-contained units that can run on any infrastructure, from a laptop to a data center. However, managing these containers at scale can be challenging, which led to the development of container orchestration tools. The battle for dominance in container orchestration has been primarily between Kubernetes, Docker Swarm, and Mesos. Kubernetes, which was originally developed by Google, has emerged as the clear winner due to its robust feature set, large community, and strong ecosystem. Today, Kubernetes is the de facto standard for container orchestration, used by companies of all sizes and industries. As enterprises seek to modernize their application platforms, they, in turn, are adopting the use of containers and following the Kubernetes framework. It is important to note that Kubernetes development is not limited to cloud computing only. Today, Kubernetes can also run in traditional on-premises data centers, hosted and managed service providers’ infrastructure, and even at the edge. Business leaders can choose to run their applications based on productivity advantages and economics rather than be restricted based on legacy setups or availability. Indeed, as the chart below demonstrates, while the actual mix of workloads on the different platforms may shift, customers are not abandoning any of the platforms any time soon. The situation is further complicated by the adoption of multi-cloud deployments. According to IDC’s Multicloud Management Survey 2019, a full 93.2% of respondents reported the use of multiple infrastructure clouds within their organization. Increase of Kubernetes Costs As the adoption of Kubernetes continues to increase, some organizations are finding that their Kubernetes costs are getting out of control. While Kubernetes is free and open source, there are significant costs associated with running it in production. These costs include infrastructure costs for running Kubernetes clusters, licensing costs for enterprise-grade features, and operational costs for managing and maintaining Kubernetes clusters. A FINOPs/CNCF survey conducted in May 2021, ‘Finops for Kubernetes, ’ provides some very insightful details on the growth of Kubernetes costs overall. Just 12% of survey respondents saw their Kubernetes costs decrease over the last 12 months. The bulk share of these costs (over 80%) are related to compute resources. The situation is further exacerbated by the fact that most Kubernetes nodes have low utilization rates across a cluster. An insightful conclusion from the FinOps 2021 survey should be a wake-up call for IT managers. “The vast majority of respondents fell into one of two camps. Either they do not monitor Kubernetes spending at all (24%), or they rely on monthly estimates (44%). A relative minority reported more advanced, accurate, and predictive Kubernetes cost monitoring processes.” Managing Kubernetes Costs Until recently, managing Kubernetes costs have been unwieldy. Kubernetes cost optimization refers to the process of reducing the expenses associated with running Kubernetes clusters. This can involve minimizing resource usage, scaling resources efficiently, and choosing cost-effective solutions for storage, networking, and other components of the Kubernetes infrastructure. 
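A quick way to get a first feel for the utilization gap described above is Kubernetes' own metrics pipeline. Assuming metrics-server is installed in the cluster, the commands below show actual node and pod consumption, which can then be compared against the resources workloads have requested; the node name is a placeholder.

Shell
# Current CPU and memory consumption per node
kubectl top nodes

# Current consumption per pod, across all namespaces
kubectl top pods --all-namespaces

# Compare against what has been requested/limited on a given node
kubectl describe node <node-name> | grep -A 8 "Allocated resources"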
Kubernetes can be a powerful tool for scaling and managing your infrastructure, but it can also be expensive if not optimized properly. There are several approaches you can take to optimize your Kubernetes costs. One is to ensure that your workloads are running efficiently, with the right amount of resources allocated to each pod or container; getting all these settings just right, and adjusting them as workloads change, is next to impossible at scale without some sort of automation. Another is to use auto-scaling features to scale resources up and down automatically based on demand. If your workload or pod usage exceeds the threshold set for a given metric, the autoscaler will increase the pod resource limits or add more pods. Likewise, if resource utilization is too low, it will scale the pods down. Types of Autoscalers Cluster Autoscaler: The Cluster Autoscaler adjusts the number of nodes in a cluster. It detects when a pod is pending (waiting for a resource) and adds nodes; it also identifies when nodes become redundant and removes them to reduce resource consumption. Horizontal Pod Autoscaler (HPA): The Horizontal Pod Autoscaler is a great tool for scaling stateless applications but can also be used to support scaling of StatefulSets, a Kubernetes object that manages a stateful application together with its persistent data. The HPA controller monitors the pods in the workload to determine if the number of pod replicas needs to change. HPA determines this by averaging the values of a performance metric, such as CPU utilization, for each pod. It estimates whether removing or adding pods would bring the metric's value closer to the desired value specified in its configuration. Vertical Pod Autoscaler (VPA): VPA is a mechanism that increases or decreases the CPU and memory resource requirements of a pod to match available cluster resources to actual usage. VPA only replaces pods managed by a replication controller, and it requires the Kubernetes metrics-server. A VPA deployment consists of three components: the Recommender, which monitors resource utilization and estimates desired values; the Updater, which checks if the pod needs a resource limit update; and the Admission Controller, which overrides resource requests when pods are created, using admission webhooks. Kubernetes Event-Driven Autoscaler (KEDA): KEDA is a Kubernetes-based event-driven autoscaler developed by Microsoft and Red Hat. With KEDA, you can drive the scaling of any container in Kubernetes based on the number of events needing to be processed. KEDA is a single-purpose and lightweight component that can be added to any Kubernetes cluster. KEDA works alongside standard Kubernetes components like the Horizontal Pod Autoscaler and can extend functionality without overwriting or duplication. With KEDA, you can explicitly choose the apps you want to scale in an event-driven fashion, with other apps continuing to function as before. This makes KEDA a flexible and safe option to run alongside any number of other Kubernetes applications or frameworks. To automatically scale a workload using predefined metrics, you might use a pod or workload autoscaler (e.g., HPA, VPA, KEDA). Pod scaling impacts the resource provisioning within a node or cluster, but this scaling approach only determines how existing resources are divided between all workloads and pods. By contrast, node scaling gives pods more resources overall by scaling up the entire cluster.
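To illustrate the Horizontal Pod Autoscaler described above, here is a minimal manifest sketch that scales a Deployment on average CPU utilization. The Deployment name, replica bounds, and target percentage are placeholders, and the CPU metric relies on metrics-server being available in the cluster.

YAML
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: demo-app-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: demo-app
  minReplicas: 2
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70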
With the intense focus on cost reduction these days, it is only natural that IT leaders will consider autoscaling as a way to mitigate over-provisioning and employ a just-in-time scaling approach based on workload needs. The Question Is: Which Approach Is Best? Manual scaling: Any practitioner of cloud computing will admit that trying to manage resources manually is a recipe for disaster. Unless you have a completely static environment or a very small Kubernetes deployment, it will be impossible to keep up with workload needs by scaling your pods manually. Automation: The vast majority of open-source and commercial autoscaling tools introduce some type of automation into the process of managing Kubernetes resources. Administrators set minimum/maximum thresholds, and the tool automatically scales clusters or pods up or down based on resource needs and availability. This in itself is a great step forward in managing Kubernetes costs. Intelligent Autoscaling: The next paradigm in Kubernetes autoscaling is one that incorporates intelligence and reinforcement learning into the automation process. Avesha’s Smart Scaler product introduces machine learning for autoscaling and was recently named one of the 100 edge computing companies to watch in 2023. Instead of using generic autoscaling templates, its patented reinforcement learning process learns the specific workload characteristics and adjusts the autoscaling process to match. It works natively with HPA and KEDA tools. This proactive scaling therefore allows IT shops to finally offer Kubernetes SLOs, something that was unheard of until now. When coupled with KubeSlice, Smart Scaler extends the autoscaling into a multi-cloud, multi-region, multi-tenant setup, providing true autoscaling across all Kubernetes infrastructure and enabling the IT manager to focus on workload optimization rather than infrastructure integration. (KubeSlice creates a flat, secure virtual network overlay for Kubernetes, eliminating north-south traffic between Kubernetes deployments.) Not only does reducing Kubernetes operating costs make good business sense, but it can also contribute to a more sustainable future by minimizing carbon footprint, creating a win-win situation for both businesses and the environment.
Does the time your CI/CD pipeline takes to deploy hold you back during development testing? This article demonstrates a faster way to develop Spring Boot microservices using a bare-metal Kubernetes cluster that runs on your own development machine. Recipe for Success This is the fourth article in a series on Ansible and Kubernetes. In the first post, I explained how to get Ansible up and running on a Linux virtual machine inside Windows. Subsequent posts demonstrated how to use Ansible to get a local Kubernetes cluster going on Ubuntu 20.04. It was tested on both native Linux- and Windows-based virtual machines running Linux. The last-mentioned approach works best when your devbox has a separate network adaptor that can be dedicated for use by the virtual machines. This article follows up on concepts used during the previous article and was tested on a cluster consisting of one control plane and one worker. As such, a fronting proxy running HAProxy was not required and was commented out in the inventory. The code is available on GitHub. When to Docker and When Not to Docker The secret to faster deployments to local infrastructure is to cut out what is not needed. For instance, does one really need to have Docker fully installed to bake images? Should one push the image produced by each build to a formal Docker repository? Is a CI/CD platform even needed? Let us answer the last question first. Maven started life with both continuous integration and continuous deployment envisaged and should be able to replace a CI/CD platform such as Jenkins for local deployments. Now, it is widely known that all Maven problems can either be resolved by changing dependencies or by adding a plugin. We are not in jar-hell, so the answer must be a plugin. The Jib build plugin does just this for the sample Spring Boot microservice we will be deploying: <build> <plugins> <plugin> <groupId>com.google.cloud.tools</groupId> <artifactId>jib-maven-plugin</artifactId> <version>3.1.4</version> <configuration> <from> <image>openjdk:11-jdk-slim</image> </from> <to> <image>docker_repo:5000/rbuhrmann/hello-svc</image> <tags> <tag>latest10</tag> </tags> </to> <allowInsecureRegistries>false</allowInsecureRegistries> </configuration> </plugin> </plugins> </build> Here we see how the Jib Maven plugin is configured to bake and push the image to a private Docker repo. However, the plugin can be steered from the command line as well. This Ansible shell task loops over one or more Spring Boot microservices and does just that: - name: Git checkouts ansible.builtin.git: repo: "{{ item.git_url }}" dest: "~/{{ item.name }}" version: "{{ item.git_branch }}" loop: "{{ apps }}" **************** - name: Run JIB builds ansible.builtin.command: "mvn clean compile jib:buildTar -Dimage={{ item.name }}:{{ item.namespace }}" args: chdir: "~/{{ item.name }}/{{ item.jib_dir }}" loop: "{{ apps }}" The first task clones, while the last integrates the Docker image. However, it does not push the image to a Docker repo. Instead, it dumps it as a tarball. We are therefore halfway towards removing the Docker repo from the loop. Since our Kubernetes cluster uses Containerd, a spinout from Docker, as its container daemon, all we need is something to load the tarball directly into Containerd. It turns out such an application exists.
It is called ctr and can be steered from Ansible: - name: Load images into containerd ansible.builtin.command: ctr -n=k8s.io images import jib-image.tar args: chdir: "/home/ansible/{{ item.name }}/{{ item.jib_dir }}/target" register: ctr_out become: true loop: "{{ apps }}" Up to this point, task execution has been on the worker node. It might seem stupid to build the image on the worker node, but keep in mind that: It concerns local testing, and there will seldom be a need for more than one K8s worker - the build will not happen on more than one machine. The base image Jib builds from is smaller than the produced image that normally is pulled from a Docker repo. This results in a faster download and a negligible upload time since the image is loaded directly into the container daemon of the worker node. The time spent downloading Git and Maven is amortized over all deployments and therefore makes up a smaller and smaller percentage of time as usage increases. Bypassing a CI/CD platform such as Jenkins or Git runners shared with other applications can save significantly on build and deployment time. You Are Deployment, I Declare Up to this point, I have only shown the Ansible tasks, but the variable declarations that are ingested have not been shown. It is now an opportune time to list part of the input: apps: - name: hello1 git_url: https://github.com/jrb-s2c-github/spinnaker_tryout.git jib_dir: hello_svc image: s2c/hello_svc namespace: env1 git_branch: kustomize application_properties: application.properties: | my_name: LocalKubeletEnv1 - name: hello2 git_url: https://github.com/jrb-s2c-github/spinnaker_tryout.git jib_dir: hello_svc image: s2c/hello_svc namespace: env2 config_map_path: git_branch: kustomize application_properties: application.properties: | my_name: LocalKubeletEnv2 It concerns the DevOps characteristics of a list of Spring Boot microservices that steer Ansible to clone, integrate, deploy, and orchestrate. We already saw how Ansible handles the first three.
All that remains are the Ansible tasks that create Kubernetes deployments, services, and application.properties ConfigMaps: - name: Create k8s namespaces remote_user: ansible kubernetes.core.k8s: kubeconfig: /home/ansible/.kube/config name: "{{ item.namespace }}" api_version: v1 kind: Namespace state: present loop: "{{ apps }}" - name: Create application.property configmaps kubernetes.core.k8s: kubeconfig: /home/ansible/.kube/config namespace: "{{ item.namespace }}" state: present definition: apiVersion: v1 kind: ConfigMap metadata: name: "{{ item.name }}-cm" data: "{{ item.application_properties }}" loop: "{{ apps }}" - name: Create deployments kubernetes.core.k8s: kubeconfig: /home/ansible/.kube/config namespace: "{{ item.namespace }}" state: present definition: apiVersion: apps/v1 kind: Deployment metadata: creationTimestamp: null labels: app: "{{ item.name }}" name: "{{ item.name }}" spec: replicas: 1 selector: matchLabels: app: "{{ item.name }}" strategy: { } template: metadata: creationTimestamp: null labels: app: "{{ item.name }}" spec: containers: - image: "{{ item.name }}:{{ item.namespace }}" name: "{{ item.name }}" resources: { } imagePullPolicy: IfNotPresent volumeMounts: - mountPath: /config name: config volumes: - configMap: items: - key: application.properties path: application.properties name: "{{ item.name }}-cm" name: config status: { } loop: "{{ apps }}" - name: Create services kubernetes.core.k8s: kubeconfig: /home/ansible/.kube/config namespace: "{{ item.namespace }}" state: present definition: apiVersion: v1 kind: List items: - apiVersion: v1 kind: Service metadata: creationTimestamp: null labels: app: "{{ item.name }}" name: "{{ item.name }}" spec: ports: - port: 80 protocol: TCP targetPort: 8080 selector: app: "{{ item.name }}" type: ClusterIP status: loadBalancer: {} loop: "{{ apps }}" These tasks run on the control plane and configure the orchestration of two microservices using the kubernetes.core.k8s Ansible task. To illustrate how different feature branches of the same application can be deployed simultaneously to different namespaces, the same image is used. However, each is deployed with different content in its application.properties. Different Git branches can also be specified. It should be noted that nothing prevents us from deploying two or more microservices into a single namespace to provide the backend services for a modern JavaScript frontend. The imagePullPolicy is set to "IfNotPresent". Since ctr already loaded the image directly into the container runtime, there is no need to pull the image from a Docker repo.
The Ansible tasks that interpret and enforce the above declarations are: - name: Create ingress master kubernetes.core.k8s: kubeconfig: /home/ansible/.kube/config namespace: default state: present definition: apiVersion: networking.k8s.io/v1 kind: Ingress metadata: name: ingress-master annotations: nginx.org/mergeable-ingress-type: "master" spec: ingressClassName: nginx rules: - host: "{{ ingress.host }}" - name: Create ingress minions kubernetes.core.k8s: kubeconfig: /home/ansible/.kube/config namespace: "{{ item.namespace }}" state: present definition: apiVersion: networking.k8s.io/v1 kind: Ingress metadata: annotations: nginx.ingress.kubernetes.io/rewrite-target: "{{ item.service_path }}" nginx.org/mergeable-ingress-type: "minion" name: "ingress-{{ item.namespace }}" spec: ingressClassName: nginx rules: - host: "{{ ingress.host }}" http: paths: - path: "{{ item.ingress_path }}" pathType: Prefix backend: service: name: "{{ item.service }}" port: number: 80 loop: "{{ ingress.rules }}" We continue where we left off in my previous post and use the Nginx Ingress Controller and MetalLB to establish Ingress routing. Once again, use is made of the Ansible loop construct to cater to multiple routing rules. In this case, routing will proceed from the /env1/hello route to the Hello K8s Service in the env1 namespace and from the /env2/hello route to the Hello K8s Service in the env2 namespace. Routing into different namespaces is achieved using Nginx mergeable ingress types. More can be read here, but basically, one annotates Ingresses as being the master or one of the minions. Multiple instances thus combine to allow for complex routing, as can be seen above. The Ingress route can and probably will differ from the endpoint of the Spring controller(s). This certainly is the case here, and a second annotation was required to change from the Ingress route to the endpoint the controller listens on: nginx.ingress.kubernetes.io/rewrite-target: "{{ item.service_path }}" This is the sample controller: @RestController public class HelloController { @RequestMapping("/") public String index() { return "Greetings from " + name; } @Value(value = "${my_name}") private String name; } Since the value of the my_name field is taken from what is defined in application.properties and each instance of the microservice has a different value for it, we would expect a different welcome message from each of the K8s Services/Deployments. Hitting the different Ingress routes, we see this is indeed the case. On Secrets and Such It can happen that your Git repository requires token authentication. For such cases, one should add the entire Git URL to the Ansible vault: apps: - name: mystery git_url: "{{ vault_git_url }}" jib_dir: harvester image: s2c/harvester namespace: env1 git_branch: main application_properties: application.properties: | my_name: LocalKubeletEnv1 The content of the variable vault_git_url is encrypted in all/vault.yaml and can be edited with: ansible-vault edit jetpack/group_vars/all/vault.yaml Enter the password of the vault and add/edit the URL to contain your authentication token: vault_git_url: https://AUTH TOKEN@github.com/jrb-s2c-github/demo.git Enough happens behind the scenes here to warrant an entire post. However, in short, group_vars are defined for inventory groups, with the vars and vaults for each inventory group in its own sub-directory of the same name as the group. The "all" sub-folder acts as the catchall for all other managed servers that fall outside this construct.
Consequently, only the "all" sub-directory is required for the master and workers groups of our inventory to use the same vault. The same approach can be followed to encrypt any secrets that should be added to the application.properties of Spring Boot.

Conclusion

We have seen how to make deployments of Spring Boot microservices to local infrastructure faster by bypassing certain steps and technologies used during CI/CD to higher environments. Multiple namespaces can be employed to allow the deployment of different versions of a microservice architecture. Some thought will have to be given to secrets management when different environments are in play, though. The focus of this article is on a local environment, and a description of how to use group vars to maintain different secrets for different environments is out of scope. It might be the topic of a future article. Please feel free to DM me on LinkedIn should you require assistance getting the rig up and running. Thank you for reading!
Artificial Intelligence (AI) and Machine Learning (ML) have revolutionized the way we approach problem-solving and data analysis. These technologies are powering a wide range of applications, from recommendation systems and autonomous vehicles to healthcare diagnostics and fraud detection. However, deploying and managing ML models in production environments can be a daunting task. This is where containerization comes into play, offering an efficient solution for packaging and deploying ML models. In this article, we'll explore the challenges of deploying ML models, the fundamentals of containerization, and the benefits of using containers for AI and ML applications.

The Challenges of Deploying ML Models

Deploying ML models in real-world scenarios presents several challenges. Traditionally, this process has been cumbersome and error-prone due to various factors:

Dependency hell: ML models often rely on specific libraries, frameworks, and software versions. Managing these dependencies across different environments can lead to compatibility issues and version conflicts.
Scalability: As the demand for AI/ML services grows, scalability becomes a concern. Ensuring that models can handle increased workloads and auto-scale as needed can be complex.
Version control: Tracking and managing different versions of ML models is crucial for reproducibility and debugging. Without proper version control, it's challenging to roll back to a previous version or track the performance of different model iterations.
Portability: ML models developed on one developer's machine may not run seamlessly on another's. Ensuring that models can be easily moved between development, testing, and production environments is essential.

Containerization Fundamentals

Containerization addresses these challenges by encapsulating an application and its dependencies into a single package, known as a container. Containers are lightweight and isolated, making them an ideal solution for deploying AI and ML models consistently across different environments. Key containerization concepts include:

Docker: Docker is one of the most popular containerization platforms. It allows you to create, package, and distribute applications as containers. Docker containers can run on any system that supports Docker, ensuring consistency across development, testing, and production.
Kubernetes: Kubernetes is an open-source container orchestration platform that simplifies the management and scaling of containers. It automates tasks like load balancing, rolling updates, and self-healing, making it an excellent choice for deploying containerized AI/ML workloads.

Benefits of Containerizing ML Models

Containerizing ML models offers several benefits:

Isolation: Containers isolate applications and their dependencies from the underlying infrastructure. This isolation ensures that ML models run consistently, regardless of the host system.
Consistency: Containers package everything needed to run an application, including libraries, dependencies, and configurations. This eliminates the "it works on my machine" problem, making deployments more reliable.
Portability: Containers can be easily moved between different environments, such as development, testing, and production. This portability streamlines the deployment process and reduces deployment-related issues.
Scalability: Container orchestration tools like Kubernetes enable auto-scaling of ML model deployments, ensuring that applications can handle increased workloads without manual intervention.
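To make the consistency and scalability points concrete, here is a minimal sketch of a Kubernetes Deployment for a hypothetical containerized model-serving image. The image name, labels, port, and resource figures are illustrative assumptions, not taken from any specific project.

# Minimal sketch of a Deployment for a hypothetical ML model server.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: fraud-model
  labels:
    app: fraud-model
spec:
  replicas: 3                      # scale out by raising this, or attach an autoscaler
  selector:
    matchLabels:
      app: fraud-model
  template:
    metadata:
      labels:
        app: fraud-model
        model-version: "1.2.0"     # track the model iteration alongside the image tag
    spec:
      containers:
        - name: model-server
          image: registry.example.com/fraud-model:1.2.0   # hypothetical image name and tag
          ports:
            - containerPort: 8080
          resources:
            requests:
              cpu: "500m"
              memory: 1Gi
            limits:
              cpu: "1"
              memory: 2Gi

Because the model and its runtime are baked into the image, the same manifest can be applied unchanged to development, testing, and production clusters, and increased load is handled by raising the replica count or adding a HorizontalPodAutoscaler.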
Best Practices for Containerizing AI/ML Models

To make the most of containerization for AI and ML, consider these best practices:

Version control: Use version control systems like Git to track changes to your ML model code. Include version information in your container images for easy reference.
Dependency management: Clearly define and manage dependencies in your ML model's container image. Utilize virtual environments or container images with pre-installed libraries to ensure reproducibility.
Monitoring and logging: Implement robust monitoring and logging solutions to gain insights into your containerized AI/ML applications' performance and behavior.
Security: Follow security best practices when building and deploying containers. Keep container images up to date with security patches and restrict access to sensitive data and APIs.

Case Studies

Several organizations have successfully adopted containerization for AI/ML deployment. One notable example is Intuitive, which leverages containers and Kubernetes to manage its machine-learning infrastructure efficiently. By containerizing ML models, Intuitive can seamlessly scale its Annotations engine to millions of users while maintaining high availability. Another example is Netflix, which reported a significant reduction in deployment times and resource overheads after adopting containers for its recommendation engines.

Conclusion

While containerization offers numerous advantages, challenges such as optimizing resource utilization and minimizing container sprawl persist. Additionally, the integration of AI/ML with serverless computing and edge computing is an emerging trend worth exploring. In conclusion, containerization is a powerful tool for efficiently packaging and deploying ML models. It addresses the challenges associated with dependency management, scalability, version control, and portability. As AI and ML continue to shape the future of technology, containerization will play a pivotal role in ensuring reliable and consistent deployments of AI-powered applications. By embracing containerization, organizations can streamline their AI/ML workflows, reduce deployment complexities, and unlock the full potential of these transformative technologies in today's rapidly evolving digital landscape.
In the rapidly changing world of technology, DevOps is the vehicle that propels software development forward, making it agile, cost-effective, fast, and productive. This article focuses on key DevOps tools and practices, delving into the transformative power of technologies such as Docker and Kubernetes. By investigating them, I hope to shed light on what it takes to streamline processes from conception to deployment and ensure high product quality in a competitive technological race.

Understanding DevOps

DevOps is a software development methodology that bridges the development (Dev) and operations (Ops) teams in order to increase productivity and shorten development cycles. It is founded on principles such as continuous integration, process automation, and improving team collaboration. Adopting DevOps breaks down silos and accelerates workflows, allowing for faster iterations and faster deployment of new features and fixes. This reduces time to market, increases efficiency in software development and deployment, and improves final product quality.

The Role of Automation in DevOps

In DevOps, automation is the foundation of software development and delivery process optimization. It involves using tools and technologies to automatically handle a wide range of routine tasks, such as code integration, testing, deployment, and infrastructure management. Through automation, development teams are able to reduce human error, standardize processes, enable faster feedback and correction, improve scalability and efficiency, and bolster testing and quality assurance, eventually enhancing consistency and reliability. Several companies have successfully leveraged automation:

Walmart: The retail corporation has embraced automation in order to gain ground on its retail rival, Amazon. WalmartLabs, the company's innovation arm, has implemented OneOps cloud-based technology, which automates and accelerates application deployment. As a result, the company was able to quickly adapt to changing market demands and continuously optimize its operations and customer service.
Etsy: The e-commerce platform fully automated its testing and deployment processes, resulting in fewer disruptions and an enhanced user experience. Its pipeline stipulates that Etsy developers first run 4,500 unit tests, which takes less than a minute, before checking the code in, which in turn triggers a further 7,000 automated tests. The whole process takes no more than 11 minutes to complete.

These cases demonstrate how automation in DevOps not only accelerates development but also ensures stable and efficient product delivery.

Leveraging Docker for Containerization

Containerization, or packaging an application's code with all of the files and libraries it needs to run quickly and reliably on any infrastructure, is one of today's most important software development practices. The leading platform that offers a comprehensive set of tools and services for containerization is Docker. It has several advantages for containerization in the DevOps pipeline:

Isolation: Docker containers encapsulate an application and its dependencies, ensuring consistent operation across different computing environments.
Efficiency: Containers are lightweight, reducing overhead and improving resource utilization when compared to traditional virtual machines.
Portability: Docker containers allow applications to be easily moved between systems and cloud environments.

Many prominent corporations leverage Docker tools and services to optimize their development cycles; before turning to a few of them, the sketch below shows what this packaging can look like in practice.
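This hypothetical docker-compose.yml packages an application together with a pinned dependency; the service names, image tags, port mapping, and environment variable are assumptions made for illustration, not taken from any of the companies discussed here.

# Hypothetical docker-compose.yml describing an app and its dependency.
services:
  web:
    build: .                          # build the application image from the local Dockerfile
    image: example/web-app:1.0.0      # pin a version so every environment runs the same artifact
    ports:
      - "8080:8080"
    environment:
      - REDIS_HOST=cache
    depends_on:
      - cache
  cache:
    image: redis:7-alpine             # the dependency is packaged and versioned too

Because the whole stack is described in one file, the same definition can be brought up identically on a developer laptop, a CI runner, or a server with docker compose up.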
Here are some examples:

PayPal: The renowned online payment system embraced Docker for app development, migrating 700+ applications to Docker Enterprise and running over 200,000 containers. As a result, the company's productivity in developing, testing, and deploying applications increased by 50%.
Visa: The global digital payment technology company used Docker to accelerate application development and testing by standardizing environments and streamlining operations. Six months after its implementation, the Docker-based platform was assisting in the processing of 100,000 transactions per day across multiple global regions.

Orchestrating Containers With Kubernetes

Managing complex containerized applications is a difficult task that necessitates the use of a specialized tool. Kubernetes (aka K8s), an open-source container orchestration system, is one of the most popular. It organizes the containers that comprise an application into logical units to facilitate management and discovery. It then automates application container distribution and scheduling across a cluster of machines, ensuring resource efficiency and high availability. Kubernetes enables easy and dynamic adjustment of application workloads, accommodating changes in demand without requiring manual intervention. This orchestration system streamlines complex tasks, allowing for more consistent and manageable deployments while optimizing resource utilization. Setting up a Kubernetes cluster entails installing Kubernetes on a set of machines, configuring networking for pods (containers), and deploying applications using Kubernetes manifests or Helm charts. This procedure creates a stable environment in which applications can be easily scaled, updated, and maintained.

Automating Development Workflows

Continuous Integration (CI) and Continuous Deployment (CD) are critical components of DevOps software development. CI is the practice of automating the integration of code changes from multiple contributors into a single software project. It is typically implemented in such a way that it triggers an automated build with testing, with the goals of quickly detecting and fixing bugs, improving software quality, and reducing release time. After the build stage, CD extends CI by automatically deploying all code changes to a testing and/or production environment. This means that, in addition to automated testing, the release process is also automated, allowing for a more efficient and streamlined path to delivering new features and updates to users. Docker and Kubernetes are frequently used to improve efficiency and consistency in CI/CD workflows. The code is first built into a Docker container, which is then pushed to a registry in the CI stage. During the CD stage, Kubernetes retrieves the Docker container from the registry and deploys it to the appropriate environment, whether testing, staging, or production. This procedure automates deployment and ensures that the application runs consistently across all environments. Many businesses use DevOps tools to automate development cycles. Among them are:

Siemens: The German multinational technology conglomerate uses GitLab's integration with Kubernetes to set up new machines in minutes. This improves software development and deployment efficiency, resulting in faster time-to-market for their products and cost savings for the company.
Shopify: The Canadian e-commerce giant chose Buildkite to power its continuous integration (CI) systems due to its flexibility and ability to be used in the company's own infrastructure. Buildkite allows lightweight Buildkite agents to run in a variety of environments and is compatible with all major operating systems.

Ensuring Security in DevOps Automation

A lack of security in DevOps can lead to serious consequences such as data breaches, where vulnerabilities in software expose sensitive information to hackers. This can not only result in operational disruptions, such as system outages that significantly increase post-deployment costs, but also lead to legal repercussions linked to compliance violations. Integrating security measures into the development process is thus crucial to avoid these risks. Best practices for ensuring security include:

In the case of Docker containers, using official images, scanning for vulnerabilities, implementing least-privilege principles, and regularly updating containers are crucial for enhancing security.
For Kubernetes clusters, it is essential to configure role-based access controls, enable network policies, and use namespace strategies to isolate resources (a brief sketch of two of these controls appears a little further on).

Here are some examples of companies handling security issues:

Capital One: The American bank holding company uses DevSecOps to automate security in its CI/CD pipelines, ensuring that security checks are integrated into every stage of software development and deployment.
Adobe: The American multinational computer software company has integrated security into its DevOps culture. Adobe ensures that its software products meet stringent security standards by using automated tools for security testing and compliance monitoring.

Overcoming Challenges and Pitfalls

Implementing DevOps and automation frequently encounters common stumbling blocks, such as resistance to change, a lack of expertise, and integration issues with existing systems. To overcome these, clear communication, training, and demonstrating the value of DevOps to all stakeholders are required. Here are some examples of how businesses overcame obstacles on their way to implementing the DevOps methodology:

HP: As a large established corporation, HP encountered a number of challenges in transitioning to DevOps, including organizational resistance to a new development culture and tools. It relied on a "trust-based culture and a strong set of tools and processes" while taking a gradual transition approach. It started with small projects and scaled up, eventually demonstrating success in overcoming skepticism.
Target: While integrating new DevOps practices, the US's seventh-largest retailer had to deal with organizational silos and technology debt accumulated over 50 years in business. It introduced a set of integration APIs that broke down departmental silos while fostering a learning and experimentation culture. It gradually improved its processes over time, resulting in a successful DevOps implementation.

The Future of DevOps and Automation

With AI and ML taking the world by storm, these new technologies are rapidly reshaping DevOps practices. In particular, they enable more efficient decision-making and predictive analytics, significantly optimizing the development pipeline. They also automate tasks such as code reviews, testing, and anomaly detection, which increases the speed and reliability of continuous integration and deployment processes.
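Before looking further ahead, here is a minimal sketch of two of the Kubernetes guardrails mentioned in the security section above: a namespaced, read-only role binding and a default-deny ingress NetworkPolicy. The namespace, role, and group names are hypothetical placeholders.

# Hypothetical namespaced read-only access plus a default-deny ingress policy.
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: readonly-pods
  namespace: team-a                  # namespace used to isolate the team's resources
rules:
  - apiGroups: [""]
    resources: ["pods", "pods/log"]
    verbs: ["get", "list", "watch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: readonly-pods-binding
  namespace: team-a
subjects:
  - kind: Group
    name: team-a-developers          # hypothetical group name
    apiGroup: rbac.authorization.k8s.io
roleRef:
  kind: Role
  name: readonly-pods
  apiGroup: rbac.authorization.k8s.io
---
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-ingress
  namespace: team-a
spec:
  podSelector: {}                    # applies to every pod in the namespace
  policyTypes:
    - Ingress                        # with no ingress rules listed, all inbound traffic is denied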
To prepare for the next evolution in DevOps, it's crucial to embrace trending technologies such as AI and machine learning and integrate them into your processes for enhanced automation and efficiency. This involves investing in training and upskilling teams to adapt to these new tools and methodologies. Adopting flexible architectures like microservices and leveraging data analytics for predictive insights will be key.

Conclusion

In this article, we have delved into the evolution of approaches to software development, with the DevOps methodology taking center stage. DevOps was created to streamline and optimize development cycles through automation, containerization, and orchestration. To reach its objectives, DevOps uses powerful technologies like Docker and Kubernetes, which not only reshape traditional workflows but also ensure enhanced security and compliance. As we look towards the future, the integration of AI and ML within this realm promises further advancements, ensuring that DevOps continues to evolve, adapting to the ever-changing landscape of software development and deployment.

Additional Resources

Read on to learn more about this topic:
The official Docker documentation
The official Kubernetes documentation
"DevOps with Kubernetes"
"DevOps: Puppet, Docker, and Kubernetes"
"Introduction to DevOps with Kubernetes"
"Docker in Action"
The most popular use case in current IT architecture is moving from serverful to serverless design. There are still cases, however, where we might need to design a service in a serverful manner, or move back to a serverful design to manage operational cost. In this article, we will show how to run a Kumologica flow as a Docker container. Usually, applications built on Kumologica are focused on serverless computing like AWS Lambda, Azure Functions, or Google Cloud Functions, but here we will be building a service very similar to a NodeJS Express app running inside a container.

The Plan

We will build a simple hello world API service using low-code integration tooling and wrap it as a Docker image. We will then run a Docker container from that image on our local machine and test the API using an external client.

Prerequisites

To start the development, we need to have the following utilities and access ready:
NodeJS installed
Kumologica Designer
Docker installed

Implementation

Building the Service

First, let's start the development of the Hello World service by opening the designer. To open the designer, use the following command: kl open. Once the designer is open, drag and drop an EventListener node onto the canvas. Open its configuration and provide the details below.

Plain Text
Provider : NodeJS
Verb : GET
Path : /hello
Display Name : [GET] /hello

Now drag and drop a Logger node from the palette onto the canvas and wire it after the EventListener node.

Plain Text
Display name : Log_Entry
level : INFO
Message : Inside the service
Log Format : String

Drag and drop the EventListenerEnd node onto the canvas, wire it to the Logger node, and provide the following configuration.

Plain Text
Display Name : Success
Payload : {"status" : "HelloWorld"}
ContentType : application/json

The flow is now complete. Let's dockerize it.

Dockerizing the Flow

To dockerize the flow, open the project folder and place the following Dockerfile in the root of the project folder (at the same level as package.json).

Plain Text
FROM node:16-alpine
WORKDIR /app
COPY package*.json ./
RUN npm install
ENV PATH /app/node_modules/.bin:$PATH
COPY . .
EXPOSE 1880
CMD ["node","index.js"]

Note: The above Dockerfile is very basic and can be modified according to your needs. Now we need to add another file that allows the Kumologica flow to run as a NodeJS Express app. Create an index.js file with the following JavaScript content, replacing "your-flow.json" with the name of the flow JSON file in your project folder.

JavaScript
const { NodeJsFlowBuilder } = require('@kumologica/runtime');
new NodeJsFlowBuilder('your-flow.json').listen();

Now let's test the flow locally by invoking the endpoint from Postman or any REST client of your choice:

curl http://localhost:1880/hello

You should get the following response:

JSON
{"status" : "HelloWorld"}

Now that we are done with our local testing, we will build an image based on our Dockerfile. To build the image, go to the root of the project folder and run the following command from a command prompt on Windows or a terminal on Mac.

Plain Text
docker build . -t hello-kl-docker-app

Now the Docker image is built. Let's check the image locally by running the following command.

Plain Text
docker images

Let's test the image by running it locally with the following command.

Plain Text
docker run -p 1880:1880 hello-kl-docker-app

Check the container by running the following command:

Plain Text
docker ps -a

You should now see the container name and ID listed. Now we are ready to push the image to any registry of your choice.
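If you later want to automate the build-and-push step, a CI sketch along the following lines could work. It assumes GitHub Actions, a hypothetical repository name (youruser/hello-kl-docker-app), and registry credentials stored as CI secrets; adapt it to whatever CI system and registry you actually use.

# Hypothetical GitHub Actions workflow that builds the image and pushes it to a registry.
name: build-and-push
on:
  push:
    branches: [main]
jobs:
  docker:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Log in to the registry
        run: echo "${{ secrets.REGISTRY_PASSWORD }}" | docker login -u "${{ secrets.REGISTRY_USER }}" --password-stdin
      - name: Build the image
        run: docker build . -t youruser/hello-kl-docker-app:${{ github.sha }}
      - name: Push the image
        run: docker push youruser/hello-kl-docker-app:${{ github.sha }}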