Let's imagine we have an app installed on a Linux server in the cloud. This app uses a list of user proxies to establish an internet connection through them and perform operations with online resources.

The Problem

Sometimes, the app has connection errors. These errors are common, but it's unclear whether they stem from a bug in the app, issues with the proxies, network/OS conditions on the server (where the app is running), or just specific cases that don't generate a particular error message. These errors only occur sometimes and not with every proxy, but with many different ones (SSH, SOCKS, HTTP(S), with and without UDP), providing no direct clues that the proxies are the cause. Additionally, it happens at a specific time of day (but this might be a coincidence). The only information available is a brief report from a user, lacking details. Short tests across different environments with various proxies and network conditions haven't reproduced the problem, but the user claims it still occurs.

The Solution

Rent the same server with the same configuration. Install the same version of the app. Run tests for 24+ hours to emulate the user's actions. Gather as much information as possible (all logs: app logs, user (test) action logs, used proxies, etc.) in a way that makes it possible to match IDs and obtain technical details in case of errors.

The Task

Write some tests with logs. Find a way to save all the log data. To make it more challenging, I'll introduce a couple of additional obstacles and assume limited resources and a deadline. By the way, this scenario is based on a real-world experience of mine, with slight twists and some details omitted (which are not important for the point).

Testing Scripts and Your Logs

I'll start with the simplest, most intuitive method for beginner programmers: when you perform actions in your scripts, you need to log specific information:

Python

output_file_path = "output_test_script.txt"

def start():
    # your function logic
    print(f'start: {response.content}')
    with open(output_file_path, "a") as file:
        file.write(f'uuid is {uuid} -- {response.content} \n')

def stop():
    # your function logic
    print(local_api_data_stop, local_api_stop_response.content)
    with open(output_file_path, "a") as file:
        file.write(f'{uuid} -- {response.content} \n')

# your other functions and logic

if __name__ == "__main__":
    with open(output_file_path, "w") as file:
        pass

Continuing, you can use print statements and save information on actions, responses, IDs, counts, etc. This approach is straightforward, simple, and direct, and it will work in many cases. However, logging everything in this manner is not considered best practice. Instead, you can utilize the built-in logging module for a more structured and efficient logging approach.

Python

import logging

# logger object
logger = logging.getLogger('example')
logger.setLevel(logging.DEBUG)

# file handler
fh = logging.FileHandler('example.log')
fh.setLevel(logging.DEBUG)

# console handler
ch = logging.StreamHandler()
ch.setLevel(logging.DEBUG)

# formatter, set it for the handlers
formatter = logging.Formatter('%(asctime)s - %(name)s - %(levelname)s - %(message)s')
fh.setFormatter(formatter)
ch.setFormatter(formatter)

# add the handlers to the logger
logger.addHandler(fh)
logger.addHandler(ch)

# logging messages
logger.debug('a debug message')
logger.info('an info message')
logger.warning('a warning message')
logger.error('an error message')

See the logging module documentation for more details. The first task is DONE! Now let's consider a case where your application has a debug log feature that rotates across 3 files, each capped at 1 MB.
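For reference, that kind of rotation setup corresponds roughly to the following configuration, shown here as a minimal sketch using Python's RotatingFileHandler; the file name and logger name are assumptions for illustration, not taken from the real app:

Python

import logging
from logging.handlers import RotatingFileHandler

# Keep three files in total: the active debug.log plus two rotated backups,
# each capped at roughly 1 MB (names are assumptions for illustration).
debug_logger = logging.getLogger('app.debug')
debug_logger.setLevel(logging.DEBUG)

rotating_handler = RotatingFileHandler('debug.log', maxBytes=1_000_000, backupCount=2)
rotating_handler.setFormatter(logging.Formatter('%(asctime)s - %(name)s - %(levelname)s - %(message)s'))
debug_logger.addHandler(rotating_handler)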
Typically, this config is sufficient. But during extended testing sessions lasting 24+ hours, with heavy activity, you may find yourself losing valuable logs due to this configuration. To deal with this issue, you might modify the application to have larger debug log files. However, this would necessitate a new build and various other adjustments. This solution may indeed be optimal, yet there are instances where such a straightforward option isn't available. For example, you might be using a server setup with restricted access. In such cases, you may need to find alternative approaches or workarounds.

Using Python, you can write a script to transfer information from the debug log to a log file without size restrictions or rotation limitations. A basic implementation could be as follows:

Python

import time

def read_new_lines(log_file, last_position):
    with open(log_file, 'r') as file:
        file.seek(last_position)
        new_data = file.read()
        new_position = file.tell()
    return new_data, new_position

def copy_new_logs(log_file, output_log):
    last_position = 0
    while True:
        new_data, last_position = read_new_lines(log_file, last_position)
        if new_data:
            with open(output_log, 'a') as output_file:
                output_file.write(new_data)
        time.sleep(1)

source_log_file = 'debug.log'
output_log = 'combined_log.txt'

copy_new_logs(source_log_file, output_log)

Now let's assume that Python isn't an option on the server for some reason: perhaps installation isn't possible due to time constraints, permission limitations, or conflicts with the operating system that you don't know how to fix. In such cases, using bash is the right choice:

Bash

#!/bin/bash

source_log_file="debug.log"
output_log="combined_log.txt"

copy_new_logs() {
    # tail -F keeps following the file across rotations and truncations;
    # -n +1 copies the existing content once before streaming new lines
    tail -F -n +1 "$source_log_file" >> "$output_log"
}

trap "echo 'Interrupted! Exiting...' && exit" SIGINT

copy_new_logs

The second task is DONE! With your detailed logs combined with the app's logs, you now have comprehensive debug information to understand the sequence of events: IDs, proxies, test data, and so on, along with the actions taken. You can run your scripts for long hours without constant supervision. The only task remaining is to analyze the debug logs to get statistics and potential information on the root cause of any issues, if they can even be replicated according to the user reports.

Some issues require thorough testing and detailed logging. By replicating the user's setup and running extensive tests, we can gather important data for pinpointing bugs. Whether using Python or bash scripts (or any other programming language), our focus on capturing detailed logs enables us to identify the root causes of errors and troubleshoot effectively. This highlights the importance of detailed logging in reproducing complex technical bugs and issues.
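To make that last analysis step concrete, here is a minimal sketch of the kind of post-run processing described above: scanning the combined log, matching entries by UUID and proxy, and counting errors per proxy. The line format and field order are assumptions for illustration, not the exact format used in the real project.

Python

import re
from collections import Counter

# Hypothetical line format: "<uuid> -- <proxy> -- <message>"; adjust the
# pattern to whatever your combined log actually contains.
LINE_PATTERN = re.compile(r'^(?P<uuid>[0-9a-f-]{36}) -- (?P<proxy>\S+) -- (?P<message>.*)$')

errors_per_proxy = Counter()

with open('combined_log.txt') as log_file:
    for line in log_file:
        match = LINE_PATTERN.match(line)
        if match and 'error' in match.group('message').lower():
            errors_per_proxy[match.group('proxy')] += 1

for proxy, count in errors_per_proxy.most_common():
    print(f'{proxy}: {count} errors')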
Comparing the backend development landscape of today with that of the late '90s reveals a significant shift. Despite the fact that the barriers of entry to software development as a career have become lower, the role is now more complex, with developers facing a broader range of challenges and expectations. Engineers today grapple with building larger and more intricate systems and an overwhelming amount of choice across all aspects of software development. From which language, tool, platform, framework, etc. to use, to which solution, architectural style, design pattern, etc. to implement. The demand for designing robust, scalable, and secure distributed systems capable of supporting thousands of concurrent users, often with near-perfect availability, and compliance with stringent data-handling and security regulations, adds to the complexity. This article delves into the ways backend development has evolved over the past 20 years, shedding light on the aspects that contribute to its perceived increase in difficulty. Higher User Expectations Today's computers boast exponentially greater memory and processing power, along with other previously unimaginable capabilities. These technological leaps enable the development of far more complex and powerful software. As software capabilities have increased, so too have user expectations. Modern users demand software that is not only globally accessible but also offers a seamless cross-platform experience, responsive design, and real-time updates and collaborative features. They expect exceptional performance, high availability, and continual updates to meet their evolving needs with new features and enhancements. This shift challenges developers to leverage an array of technologies to meet these expectations, making backend development even more challenging. Increased Scale and System Complexity The complexity of software problems we tackle today far surpasses those from 20 years ago. We are now orchestrating networks of computers, processing thousands of transactions per second, and scaling systems to accommodate millions of users. Developers now need to know how to handle massive, polyglot codebases, implement distributed systems, and navigate the complexities of multithreading and multiprocessing. Additionally, the necessity for effective abstraction and dependency management further complicates the development process. With complex distributed systems, abstractions are essential to allow developers to reduce complexity, hide the unnecessary details and focus on higher-level functionality. The downside of the widespread use of abstractions is that debugging is much more difficult and having a comprehensive understanding of a system much more challenging, especially due to the limitations of traditional system visualization tools. Furthermore, the proliferation of APIs necessitates meticulous dependency management to prevent the creation of systems that are convoluted, fragile, or opaque, making them challenging to understand, maintain, or expand. Although many developers still resort to whiteboards or non-interactive diagramming tools to map their systems, recently, more dynamic and automated tools have emerged, offering real-time insights into system architecture. These changes, along with many others (e.g. heightened security requirements, the introduction of caching, increased expectations for test coverage, exception handling, compiler optimization, etc.), underscore the increased complexity of modern backend development. 
The era when a single programmer could oversee an entire system is long gone, replaced by the need for large, distributed teams and extensive collaboration, documentation, and organizational skills. Overwhelming Choice With the rapid pace that technology is evolving, developers now have to navigate a vast and ever-growing ecosystem of programming languages, frameworks, libraries, tools, and platforms. This can lead to decision paralysis, exemplifying the paradox of choice: it is a mistake to assume that if we give developers more choice, they will be happier and more productive. Unlimited choice is more attractive in theory than in practice. The plethora of choices in the tech landscape is documented in the latest CNCF report - which shows hundreds of options! While a degree of autonomy in choosing the best technology or tool for a solution is important, too much choice can lead to overload and ultimately overwhelm people or cause procrastination or inaction. The solution is to strike a balance between providing developers with the freedom to make meaningful choices and curating the options to prevent choice overload. By offering well-vetted, purpose-driven recommendations and fostering a culture of knowledge-sharing and best practices, we empower developers to navigate the expansive tech landscape with confidence and efficiency. Different Set of Skills The advent of cloud computing has introduced additional complexities for backend developers, requiring them to be proficient in deploying and managing applications in cloud environments, understanding containerization, and selecting appropriate orchestration tools. Besides technical knowledge, skills in modern backend developers that are particularly valued are: Managing legacy software and reducing architectural technical debt. The majority of projects developers work on these days are “brown field”. Knowing how to adapt and evolve architectures to accommodate unforeseen use cases, all the while managing — and possibly reducing — architectural technical debt is a prized skill. Assembling software by choosing the right technologies. With the explosion of software-as-a-service (SaaS) and open-source software, software development has shifted to an assembly-like approach, where backend engineers need to meticulously select and combine components, libraries, and frameworks to create a complete system where each piece fits seamlessly. Designing a scalable, performant, and secure system architecture. Backend software engineers are designers too and they must possess a deep understanding of software design principles to create scalable and maintainable applications. Cross-team communication. Distributed systems are built by large teams that comprise many different stakeholders. A sign of a great engineer is the ability to communicate effectively, fostering a shared understanding and efficient decision-making across all stakeholders. Conclusion In reflecting on the evolution of backend development over the past two decades, it becomes evident that the role has transformed from a relatively straightforward task of server-side programming to a multifaceted discipline requiring a broad spectrum of skills. The challenges of meeting higher user expectations, managing the scale and complexity of systems, navigating an overwhelming array of choices, and acquiring a diverse set of skills highlights the complexity of modern backend development. 
While it has never been easier to enter the field of software development, excelling as a backend developer today requires navigating a more complex and rapidly evolving technological environment. Possessing expertise in system architecture, cloud services, containerization, and orchestration tools, alongside the soft skills necessary for effective cross-team communication, will remain pivotal for success in this dynamic domain.
In this article, we will implement a microservice using Clean Architecture and CQRS. The tech stack uses Kotlin, Spring WebFlux with coroutines, PostgreSQL and MongoDB, Kafka as a message broker, and the Arrow-kt functional library, which, as the documentation says, brings idiomatic functional programming to Kotlin. Clean Architecture Clean Architecture is one of the more popular software design approaches. It follows the principles of Dependency Inversion, Single Responsibility, and Separation of Concerns. It consists of concentric circles representing different layers, with the innermost layer being the most abstract and the outermost layer representing the user interface and infrastructure. By separating the concerns of the various components and enforcing the dependency rule, it becomes much easier to understand and modify the code. Depending on abstractions allows you to design your business logic flexibly without having to know the implementation details. The Domain Layer and the Application Layer are the core of the Clean Architecture. These two layers together form the application core, encapsulating the most important business rules of the system. Clean Architecture is a domain-centric architectural approach that separates business logic from technical implementation details. CQRS CQRS stands for Command and Query Responsibility Segregation, a pattern that separates reads and writes into different models, using commands to update data, and queries to read data. Using CQRS, you should have a strict separation between the write model and the read model. Those two models should be processed by separate objects and not be conceptually linked together. Those objects are not physical storage structures but are, for example, command handlers and query handlers. They’re not related to where and how the data will be stored: they’re connected to the processing behavior. Command handlers are responsible for handling commands, mutating state, or doing other side effects. Query handlers are responsible for returning the result of the requested query. They give us: Scalability, which allows for independent scaling of read and write operations Performance: By separating read and write operations, you can optimize each for performance. Reads can be optimized for fast retrieval by using denormalized data structures, caching, and specialized read models tailored to specific query needs. Flexibility allows us to model the read and write sides of the application differently, which provides flexibility in designing the data structures and processing logic to best suit the requirements of each operation. This flexibility can lead to a more efficient and maintainable system, especially in complex domains where the read and write requirements differ significantly. One of the common misconceptions about CQRS is that the commands and queries should be run on separate databases. This isn’t necessarily true, only that the behaviors and responsibilities for both should be separated. This can be within the code, within the structure of the database, or different databases. Nothing in an inner circle can know anything about something in an outer circle. In particular, the name of something declared in an outer circle must not be mentioned by the code in the inner circle. That includes functions and classes, variables, or any other named software entity. In the real world, understanding Clean Architecture can vary from person to person. 
Since Clean Architecture emphasizes principles such as separation of concerns, dependency inversion, and abstraction layers, different developers may interpret and implement these principles differently based on their own experiences, knowledge, and project requirements. This article shows my personal view of one of the possible ways of implementation. Ultimately, the goal of Clean Architecture is to create software systems that are maintainable, scalable, and easy to understand. Layers Presentation Layer The Presentation Layer (named api here) is the most outside layer and the entry point to our system. The most important part of the presentation layer is the controllers, which define the API endpoints in our system presented to the outside world and are responsible for: Handling interaction with the outside world Presenting, displaying, or returning responses with the data Translating the outside requests data (map requests to application layer commands) Works with framework-specific configuration setup Works on top of the application layer Let's look at the full process of command requests in the microservice. First things first: it accepts REST HTTP requests; validates input; if it's secured, checks credentials, etc.; then maps the request to the DTO the command and calls the AccountCommandService handle method. For example, let's look at creating new account and deposit balance commands methods call flow: Kotlin @Tag(name = "Accounts", description = "Account domain REST endpoints") @RestController @RequestMapping(path = ["/api/v1/accounts"]) class AccountController( private val accountCommandService: AccountCommandService, private val accountQueryService: AccountQueryService ) { @Operation( method = "createAccount", operationId = "createAccount", description = "Create new Account", responses = [ ApiResponse( description = "Create new Account", responseCode = "201", content = [Content( mediaType = MediaType.APPLICATION_JSON_VALUE, schema = Schema(implementation = AccountId::class) )] ), ApiResponse( description = "bad request response", responseCode = "400", content = [Content(schema = Schema(implementation = ErrorHttpResponse::class))] )], ) @PostMapping suspend fun createAccount( @Valid @RequestBody request: CreateAccountRequest ): ResponseEntity<out Any> = eitherScope(ctx) { accountCommandService.handle(request.toCommand()).bind() }.fold( ifLeft = { mapErrorToResponse(it) }, ifRight = { ResponseEntity.status(HttpStatus.CREATED).body(it) } ) @Operation( method = "depositBalance", operationId = "depositBalance", description = "Deposit balance", responses = [ ApiResponse( description = "Deposit balance", responseCode = "200", content = [Content( mediaType = MediaType.APPLICATION_JSON_VALUE, schema = Schema(implementation = BaseResponse::class) )] ), ApiResponse( description = "bad request response", responseCode = "400", content = [Content(schema = Schema(implementation = ErrorHttpResponse::class))] )], ) @PutMapping(path = ["/{id}/deposit"]) suspend fun depositBalance( @PathVariable id: UUID, @Valid @RequestBody request: DepositBalanceRequest ): ResponseEntity<out Any> = eitherScope(ctx) { accountCommandService.handle(request.toCommand(AccountId(id))).bind() }.fold( ifLeft = { mapErrorToResponse(it) }, ifRight = { okResponse(it) } ) } Application and Domain Layers The Application Layer contains the use cases of the application. A use case represents a specific interaction or action that the system can perform. Each use case is implemented as a command or a query. 
It is part of the whole application core like a Domain Layer and is responsible for: Executing the application use cases (all the actions and commands allowed to be done with the system) Fetch domain objects Manipulating domain objects The Application Layer AccountCommandService has the business logic, which runs required business rules validations, then applies changes to the domain aggregate, persists domain objects in the database, produces the domain events, and persists them in the outbox table within one single transaction. The current application used some not-required small optimization for outbox publishing. After the command service commits the transaction, we publish the event, but we don't care if this publish fails, because the polling publisher realizes that the Spring scheduler will process it anyway. Arrow greatly improves developer experience because Kotlin doesn’t ship the Either type with the standard SDK. Either is an entity whose value can be of two different types, called left and right. By convention, the right is for the success case and the left is for the error one. It allows us to express the fact that a call might return a correct value or an error, and differentiate between the two of them. The left/right naming pattern is just a convention. Either is a great way to make the error handling in your code more explicit. Making the code more explicit reduces the amount of context that you need to keep in your head, which in turn makes the code easier to understand. Kotlin interface AccountCommandService { suspend fun handle(command: CreateAccountCommand): Either<AppError, AccountId> suspend fun handle(command: ChangeAccountStatusCommand): Either<AppError, Unit> suspend fun handle(command: ChangeContactInfoCommand): Either<AppError, Unit> suspend fun handle(command: DepositBalanceCommand): Either<AppError, Unit> suspend fun handle(command: WithdrawBalanceCommand): Either<AppError, Unit> suspend fun handle(command: UpdatePersonalInfoCommand): Either<AppError, Unit> } @Service class AccountCommandServiceImpl( private val accountRepository: AccountRepository, private val outboxRepository: OutboxRepository, private val tx: TransactionalOperator, private val eventPublisher: EventPublisher, private val serializer: Serializer, private val emailVerifierClient: EmailVerifierClient, private val paymentClient: PaymentClient ) : AccountCommandService { override suspend fun handle(command: CreateAccountCommand): Either<AppError, AccountId> = eitherScope(ctx) { emailVerifierClient.verifyEmail(command.contactInfo.email).bind() val (account, event) = tx.executeAndAwait { val account = accountRepository.save(command.toAccount()).bind() val event = outboxRepository.insert(account.toAccountCreatedOutboxEvent(serializer)).bind() account to event } publisherScope.launch { publishOutboxEvent(event) } account.accountId } override suspend fun handle(command: DepositBalanceCommand): Either<AppError, Unit> = eitherScope(ctx) { paymentClient.verifyPaymentTransaction(command.accountId.string(), command.transactionId).bind() val event = tx.executeAndAwait { val foundAccount = accountRepository.getById(command.accountId).bind() foundAccount.depositBalance(command.balance).bind() val account = accountRepository.update(foundAccount).bind() val event = account.toBalanceDepositedOutboxEvent(command.balance, serializer) outboxRepository.insert(event).bind() } publisherScope.launch { publishOutboxEvent(event) } } } The Domain Layer encapsulates the most important business rules of the system. 
It is the place where we have to start building core business rules. In the domain-centric architecture, we start developing from the domain. The responsibilities of the Domain Layer are as follows: Defining domain models Defining rules, domain, and business errors Executing the application business logic Enforcing the business rules Domain models have data and behavior and represent the domain. We have two approaches for designing: rich and anemic domain models. Anemic models allow external manipulation of our data, and it's usually antipattern because the domain object itself doesn't control its own data. Rich domain models contain both data and behavior. The richer the behavior, the richer the domain model. It exposes only a specific set of public methods, which allows manipulation of data only in the way the domain approves, encapsulates logic, and does validations. Rich domain model properties are read-only by default. Domain models can be always valid or not; it's better to prefer always-valid domain models. At any point in time when we're working with domain state, we know it's valid and don't need to write additional validations to check it. Always-valid domain models mean they are in a valid state all the time. One more important detail is Persistence Ignorance - modeling the domain without taking into account how domain objects will be persisted. Kotlin class Account( val accountId: AccountId = AccountId(), ) { var contactInfo: ContactInfo = ContactInfo() private set var personalInfo: PersonalInfo = PersonalInfo() private set var address: Address = Address() private set var balance: Balance = Balance() private set var status: AccountStatus = AccountStatus.FREE private set var version: Long = 0 private set var updatedAt: Instant? = null private set var createdAt: Instant? 
= null private set fun depositBalance(newBalance: Balance): Either<AppError, Account> = either { if (balance.balanceCurrency != newBalance.balanceCurrency) raise(InvalidBalanceCurrency("invalid currency: $newBalance")) if (newBalance.amount < 0) raise(InvalidBalanceAmount("invalid balance amount: $newBalance")) balance = balance.copy(amount = (balance.amount + newBalance.amount)) updatedAt = Instant.now() this@Account } fun withdrawBalance(newBalance: Balance): Either<AppError, Account> = either { if (balance.balanceCurrency != newBalance.balanceCurrency) raise(InvalidBalanceCurrency("invalid currency: $newBalance")) if (newBalance.amount < 0) raise(InvalidBalanceAmount("invalid balance amount: $newBalance")) val newAmount = (balance.amount - newBalance.amount) if ((newAmount) < 0) raise(InvalidBalanceError("invalid balance: $newBalance")) balance = balance.copy(amount = newAmount) updatedAt = Instant.now() this@Account } fun updateStatus(newStatus: AccountStatus): Either<AppError, Account> = either { status = newStatus updatedAt = Instant.now() this@Account } fun changeContactInfo(newContactInfo: ContactInfo): Either<AppError, Account> = either { contactInfo = newContactInfo updatedAt = Instant.now() this@Account } fun changeAddress(newAddress: Address): Either<AppError, Account> = either { address = newAddress updatedAt = Instant.now() this@Account } fun changePersonalInfo(newPersonalInfo: PersonalInfo): Either<AppError, Account> = either { personalInfo = newPersonalInfo updatedAt = Instant.now() this@Account } fun incVersion(amount: Long = 1): Either<AppError, Account> = either { if (amount < 1) raise(InvalidVersion("invalid version: $amount")) version += amount updatedAt = Instant.now() this@Account } fun withVersion(amount: Long = 1): Account { version = amount updatedAt = Instant.now() return this } fun decVersion(amount: Long = 1): Either<AppError, Account> = either { if (amount < 1) raise(InvalidVersion("invalid version: $amount")) version -= amount updatedAt = Instant.now() this@Account } fun withUpdatedAt(newValue: Instant): Account { updatedAt = newValue return this } } Infrastructure Layer Next is the Infrastructure Layer, which contains implementations for external-facing services and is responsible for: Interacting with the persistence solution Interacting with other services (HTTP or gRPC clients, message brokers, etc.) Actual implementations of the interfaces from the application layer Identity concerns At the Infrastructure Layer, we have implementations of the Application Layer interfaces. The main write database used PostgreSQL with r2dbc reactive driver, and DatabaseClient with raw SQL queries. If we want to use an ORM entity, we still pass domain objects through the other layer interfaces anyway, and then inside the repository implementation, code map to the ORM entities. For this project, keep Spring annotations as is; but if we want cleaner implementation, it's possible to move them to another layer. In this example, the project SQL schema is simplified and not normalized. 
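As a point of reference, a simplified, denormalized write-side table for this domain might look something like the sketch below. The table and column names are assumptions derived from the Account domain model above, not the project's actual DDL:

Kotlin

// Hypothetical, simplified DDL kept as a raw SQL constant, in the same style
// as the project's other queries; names and types are assumptions.
const val CREATE_ACCOUNTS_TABLE_QUERY = """
    |CREATE TABLE IF NOT EXISTS accounts (
    |    account_id       UUID PRIMARY KEY,
    |    email            VARCHAR(255),
    |    phone            VARCHAR(60),
    |    first_name       VARCHAR(255),
    |    last_name        VARCHAR(255),
    |    address          VARCHAR(500),
    |    balance_amount   BIGINT NOT NULL DEFAULT 0,
    |    balance_currency VARCHAR(3),
    |    status           VARCHAR(60),
    |    version          BIGINT NOT NULL DEFAULT 1,
    |    created_at       TIMESTAMP WITH TIME ZONE,
    |    updated_at       TIMESTAMP WITH TIME ZONE
    |)
"""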
Kotlin

interface AccountRepository {
    suspend fun getById(id: AccountId): Either<AppError, Account>
    suspend fun save(account: Account): Either<AppError, Account>
    suspend fun update(account: Account): Either<AppError, Account>
}

@Repository
class AccountRepositoryImpl(
    private val dbClient: DatabaseClient
) : AccountRepository {

    override suspend fun save(account: Account): Either<AppError, Account> =
        eitherScope<AppError, Account>(ctx) {
            dbClient.sql(INSERT_ACCOUNT_QUERY.trimMargin())
                .bindValues(account.withVersion(FIRST_VERSION).toPostgresEntityMap())
                .fetch()
                .rowsUpdated()
                .awaitSingle()
            account
        }

    override suspend fun update(account: Account): Either<AppError, Account> =
        eitherScope(ctx) {
            dbClient.sql(OPTIMISTIC_UPDATE_QUERY.trimMargin())
                .bindValues(account.withUpdatedAt(Instant.now()).toPostgresEntityMap(withOptimisticLock = true))
                .fetch()
                .rowsUpdated()
                .awaitSingle()
            account.incVersion().bind()
        }

    override suspend fun getById(id: AccountId): Either<AppError, Account> =
        eitherScope(ctx) {
            dbClient.sql(GET_ACCOUNT_BY_ID_QUERY.trimMargin())
                .bind(ID_FIELD, id.id)
                .map { row, _ -> row.toAccount() }
                .awaitSingleOrNull()
                ?: raise(AccountNotFoundError("account for id: $id not found"))
        }
}

An important detail about the outbox repository implementation: to handle multiple pod instances processing the outbox table in parallel, we of course have idempotent consumers, but we still want to avoid processing the same outbox events more than once. To prevent multiple instances from selecting and publishing the same events, we use FOR UPDATE SKIP LOCKED. It works like this: when one instance selects a batch of outbox events, any records already locked by another instance are skipped, and only the next available, unlocked records are selected. Again, this is only my personally preferred way of implementing it; relying solely on a polling publisher is usually the default approach. A possible alternative is Debezium, for example, but that's up to you.
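For illustration, here is a minimal sketch of what a locking query such as GET_OUTBOX_EVENTS_FOR_UPDATE_SKIP_LOCKED_QUERY (referenced in the repository below) might look like; the table and column names are assumptions rather than the project's actual schema:

Kotlin

// Hypothetical sketch of the SKIP LOCKED batch query; the real constant lives
// in the project, and the table/column names here are assumptions.
const val GET_OUTBOX_EVENTS_FOR_UPDATE_SKIP_LOCKED_QUERY = """
    |SELECT event_id, aggregate_id, event_type, data, version, timestamp
    |FROM outbox_table
    |ORDER BY timestamp ASC
    |LIMIT :limit
    |FOR UPDATE SKIP LOCKED
"""

The repository below selects a batch with this kind of query inside a transaction, hands each event to the publishing callback, and deletes it within the same transaction, so a crashed instance simply releases its locks and another instance picks those events up later.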
Kotlin interface OutboxRepository { suspend fun insert(event: OutboxEvent): Either<AppError, OutboxEvent> suspend fun deleteWithLock( event: OutboxEvent, callback: suspend (event: OutboxEvent) -> Either<AppError, Unit> ): Either<AppError, OutboxEvent> suspend fun deleteEventsWithLock( batchSize: Int, callback: suspend (event: OutboxEvent) -> Either<AppError, Unit> ): Either<AppError, Unit> } @Component class OutboxRepositoryImpl( private val dbClient: DatabaseClient, private val tx: TransactionalOperator ) : OutboxRepository { override suspend fun insert(event: OutboxEvent): Either<AppError, OutboxEvent> = eitherScope(ctx) { dbClient.sql(INSERT_OUTBOX_EVENT_QUERY.trimMargin()) .bindValues(event.toPostgresValuesMap()) .map { row, _ -> row.get(ROW_EVENT_ID, String::class.java) } .one() .awaitSingle() .let { event } } override suspend fun deleteWithLock( event: OutboxEvent, callback: suspend (event: OutboxEvent) -> Either<AppError, Unit> ): Either<AppError, OutboxEvent> = eitherScope { tx.executeAndAwait { dbClient.sql(GET_OUTBOX_EVENT_BY_ID_FOR_UPDATE_SKIP_LOCKED_QUERY.trimMargin()) .bindValues(mutableMapOf(EVENT_ID to event.eventId)) .map { row, _ -> row.get(ROW_EVENT_ID, String::class.java) } .one() .awaitSingleOrNull() callback(event).bind() deleteOutboxEvent(event).bind() event } } override suspend fun deleteEventsWithLock( batchSize: Int, callback: suspend (event: OutboxEvent) -> Either<AppError, Unit> ): Either<AppError, Unit> = eitherScope(ctx) { tx.executeAndAwait { dbClient.sql(GET_OUTBOX_EVENTS_FOR_UPDATE_SKIP_LOCKED_QUERY.trimMargin()) .bind(LIMIT, batchSize) .map { row, _ -> row.toOutboxEvent() } .all() .asFlow() .onStart { log.info { "start publishing outbox events batch: $batchSize" } } .onEach { callback(it).bind() } .onEach { event -> deleteOutboxEvent(event).bind() } .onCompletion { log.info { "completed publishing outbox events batch: $batchSize" } } .collect() } } private suspend fun deleteOutboxEvent(event: OutboxEvent): Either<AppError, Long> = eitherScope(ctx) { dbClient.sql(DELETE_OUTBOX_EVENT_BY_ID_QUERY) .bindValues(mutableMapOf(EVENT_ID to event.eventId)) .fetch() .rowsUpdated() .awaitSingle() } } The polling publisher implementation is a scheduled process that does the same job for publishing and deleting events at the given interval, as typed earlier, and uses the same service method: Kotlin @Component @ConditionalOnProperty(prefix = "schedulers", value = ["outbox.enable"], havingValue = "true") class OutboxScheduler( private val outboxRepository: OutboxRepository, private val publisher: EventPublisher, ) { @Value("\${schedulers.outbox.batchSize}") private var batchSize: Int = 30 @Scheduled( initialDelayString = "\${schedulers.outbox.initialDelayMillis}", fixedRateString = "\${schedulers.outbox.fixedRate}" ) fun publishOutboxEvents() = runBlocking { eitherScope { outboxRepository.deleteEventsWithLock(batchSize) { publisher.publish(it) }.bind() }.fold( ifLeft = { err -> log.error { "error while publishing scheduler outbox events: $err" } }, ifRight = { log.info { "outbox scheduler published events" } } ) } } A domain event is something interesting from a business point of view that happened within the system; something that already occurred. We're capturing the fact something happened with the system. After events have been published from the outbox table to the broker, in this application, it consumes them from Kafka, and the consumers themselves call EventHandlerService methods, which builds a read model for our domain aggregates. 
The read model of a CQRS-based system provides materialized views of the data, typically as highly denormalized views. These views are tailored to the interfaces and display requirements of the application, which helps to maximize both display and query performance. For error handling and retry, messages prefer to use separate retry topics and listeners. Using the stream of events as the write store rather than the actual data at a point in time avoids update conflicts on a single aggregate and maximizes performance and scalability. The events can be used to asynchronously generate materialized views of the data that are used to populate the read store. As with any system where the write and read stores are separate, systems based on this pattern are only eventually consistent. There will be some delay between the event being generated and the data store being updated. Here is Kafka consumer implementation: Kotlin @Component class BalanceDepositedEventConsumer( private val eventProcessor: EventProcessor, private val kafkaTopics: KafkaTopics ) { @KafkaListener( groupId = "\${kafka.consumer-group-id:account_microservice_group_id}", topics = ["\${topics.accountBalanceDeposited.name}"], ) fun process(ack: Acknowledgment, record: ConsumerRecord<String, ByteArray>) = eventProcessor.process( ack = ack, consumerRecord = record, deserializationClazz = BalanceDepositedEvent::class.java, onError = eventProcessor.errorRetryHandler(kafkaTopics.accountBalanceDepositedRetry.name, DEFAULT_RETRY_COUNT) ) { event -> eventProcessor.on( ack = ack, consumerRecord = record, event = event, retryTopic = kafkaTopics.accountBalanceDepositedRetry.name ) } @KafkaListener( groupId = "\${kafka.consumer-group-id:account_microservice_group_id}", topics = ["\${topics.accountBalanceDepositedRetry.name}"], ) fun processRetry(ack: Acknowledgment, record: ConsumerRecord<String, ByteArray>) = eventProcessor.process( ack = ack, consumerRecord = record, deserializationClazz = BalanceDepositedEvent::class.java, onError = eventProcessor.errorRetryHandler(kafkaTopics.accountBalanceDepositedRetry.name, DEFAULT_RETRY_COUNT) ) { event -> eventProcessor.on( ack = ack, consumerRecord = record, event = event, retryTopic = kafkaTopics.accountBalanceDepositedRetry.name ) } } At the Application Layer, AccountEventsHandlerService is implemented in the following way: Kotlin interface AccountEventHandlerService { suspend fun on(event: AccountCreatedEvent): Either<AppError, Unit> suspend fun on(event: BalanceDepositedEvent): Either<AppError, Unit> suspend fun on(event: BalanceWithdrawEvent): Either<AppError, Unit> suspend fun on(event: PersonalInfoUpdatedEvent): Either<AppError, Unit> suspend fun on(event: ContactInfoChangedEvent): Either<AppError, Unit> suspend fun on(event: AccountStatusChangedEvent): Either<AppError, Unit> } @Component class AccountEventHandlerServiceImpl( private val accountProjectionRepository: AccountProjectionRepository ) : AccountEventHandlerService { override suspend fun on(event: AccountCreatedEvent): Either<AppError, Unit> = eitherScope(ctx) { accountProjectionRepository.save(event.toAccount()).bind() } override suspend fun on(event: BalanceDepositedEvent): Either<AppError, Unit> = eitherScope(ctx) { findAndUpdateAccountById(event.accountId, event.version) { account -> account.depositBalance(event.balance).bind() }.bind() } private suspend fun findAndUpdateAccountById( accountId: AccountId, eventVersion: Long, block: suspend (Account) -> Account ): Either<AppError, Account> = eitherScope(ctx) { val foundAccount = 
findAndValidateVersion(accountId, eventVersion).bind() val accountForUpdate = block(foundAccount) accountProjectionRepository.update(accountForUpdate).bind() } private suspend fun findAndValidateVersion( accountId: AccountId, eventVersion: Long ): Either<AppError, Account> = eitherScope(ctx) { val foundAccount = accountProjectionRepository.getById(accountId).bind() validateVersion(foundAccount, eventVersion).bind() foundAccount } } The infrastructure layer read model repository uses MongoDB's Kotlin coroutines driver: Kotlin interface AccountProjectionRepository { suspend fun save(account: Account): Either<AppError, Account> suspend fun update(account: Account): Either<AppError, Account> suspend fun getById(id: AccountId): Either<AppError, Account> suspend fun getByEmail(email: String): Either<AppError, Account> suspend fun getAll(page: Int, size: Int): Either<AppError, AccountsList> suspend fun upsert(account: Account): Either<AppError, Account> } Kotlin @Component class AccountProjectionRepositoryImpl( mongoClient: MongoClient, ) : AccountProjectionRepository { private val accountsDB = mongoClient.getDatabase(ACCOUNTS_DB) private val accountsCollection = accountsDB.getCollection<AccountDocument>(ACCOUNTS_COLLECTION) override suspend fun save(account: Account): Either<AppError, Account> = eitherScope<AppError, Account>(ctx) { val insertResult = accountsCollection.insertOne(account.toDocument()) log.info { "account insertOneResult: $insertResult, account: $account" } account } override suspend fun update(account: Account): Either<AppError, Account> = eitherScope(ctx) { val filter = and(eq(ACCOUNT_ID, account.accountId.string()), eq(VERSION, account.version)) val options = FindOneAndUpdateOptions().upsert(false).returnDocument(ReturnDocument.AFTER) accountsCollection.findOneAndUpdate( filter, account.incVersion().bind().toBsonUpdate(), options ) ?.toAccount() ?: raise(AccountNotFoundError("account with id: ${account.accountId} not found")) } override suspend fun upsert(account: Account): Either<AppError, Account> = eitherScope(ctx) { val filter = and(eq(ACCOUNT_ID, account.accountId.string())) val options = FindOneAndUpdateOptions().upsert(true).returnDocument(ReturnDocument.AFTER) accountsCollection.findOneAndUpdate( filter, account.toBsonUpdate(), options ) ?.toAccount() ?: raise(AccountNotFoundError("account with id: ${account.accountId} not found")) } override suspend fun getById(id: AccountId): Either<AppError, Account> = eitherScope(ctx) { accountsCollection.find<AccountDocument>(eq(ACCOUNT_ID, id.string())) .firstOrNull() ?.toAccount() ?: raise(AccountNotFoundError("account with id: $id not found")) } override suspend fun getByEmail(email: String): Either<AppError, Account> = eitherScope(ctx) { val filter = and(eq(CONTACT_INFO_EMAIL, email)) accountsCollection.find(filter).firstOrNull()?.toAccount() ?: raise(AccountNotFoundError("account with email: $email not found")) } override suspend fun getAll( page: Int, size: Int ): Either<AppError, AccountsList> = eitherScope<AppError, AccountsList>(ctx) { parZip(coroutineContext, { accountsCollection.find() .skip(page * size) .limit(size) .map { it.toAccount() } .toList() }, { accountsCollection.find().count() }) { list, totalCount -> AccountsList( page = page, size = size, totalCount = totalCount, accountsList = list ) } } } Read queries' way through the layers is very similar: we accept HTTP requests at the API layer: Kotlin @Tag(name = "Accounts", description = "Account domain REST endpoints") @RestController @RequestMapping(path = 
["/api/v1/accounts"]) class AccountController( private val accountCommandService: AccountCommandService, private val accountQueryService: AccountQueryService ) { @Operation( method = "getAccountByEmail", operationId = "getAccountByEmail", description = "Get account by email", responses = [ ApiResponse( description = "Get account by email", responseCode = "200", content = [Content( mediaType = MediaType.APPLICATION_JSON_VALUE, schema = Schema(implementation = AccountResponse::class) )] ), ApiResponse( description = "bad request response", responseCode = "400", content = [Content(schema = Schema(implementation = ErrorHttpResponse::class))] )], ) @GetMapping(path = ["/email/{email}"]) suspend fun getAccountByEmail( @PathVariable @Email @Size( min = 6, max = 255 ) email: String ): ResponseEntity<out Any> = eitherScope(ctx) { accountQueryService.handle(GetAccountByEmailQuery(email)).bind() }.fold( ifLeft = { mapErrorToResponse(it) }, ifRight = { ResponseEntity.ok(it.toResponse()) } ) } Application Layer AccountQueryService methods: Kotlin interface AccountQueryService { suspend fun handle(query: GetAccountByIdQuery): Either<AppError, Account> suspend fun handle(query: GetAccountByEmailQuery): Either<AppError, Account> suspend fun handle(query: GetAllAccountsQuery): Either<AppError, AccountsList> } Kotlin @Service class AccountQueryServiceImpl( private val accountRepository: AccountRepository, private val accountProjectionRepository: AccountProjectionRepository ) : AccountQueryService { override suspend fun handle(query: GetAccountByIdQuery): Either<AppError, Account> = eitherScope(ctx) { accountRepository.getById(query.id).bind() } override suspend fun handle(query: GetAccountByEmailQuery): Either<AppError, Account> = eitherScope(ctx) { accountProjectionRepository.getByEmail(query.email).bind() } override suspend fun handle(query: GetAllAccountsQuery): Either<AppError, AccountsList> = eitherScope(ctx) { accountProjectionRepository.getAll(page = query.page, size = query.size).bind() } } And it uses PostgreSQL or MongoDB repositories to get the data depending on the query use case: Kotlin @Component class AccountProjectionRepositoryImpl( mongoClient: MongoClient, ) : AccountProjectionRepository { private val accountsDB = mongoClient.getDatabase(ACCOUNTS_DB) private val accountsCollection = accountsDB.getCollection<AccountDocument>(ACCOUNTS_COLLECTION) override suspend fun getByEmail(email: String): Either<AppError, Account> = eitherScope(ctx) { val filter = and(eq(CONTACT_INFO_EMAIL, email)) accountsCollection.find(filter) .firstOrNull() ?.toAccount() ?: raise(AccountNotFoundError("account with email: $email not found")) } } Final Thoughts In real-world applications, we have to implement many more necessary features, like K8s health checks, circuit breakers, rate limiters, etc., so this project is simplified for demonstration purposes. The source code is in my GitHub, please star it if is helpful and useful for you. For feedback or questions, feel free to contact me!
Java 21 just got simpler! Want to write cleaner, more readable code? Dive into pattern matching, a powerful new feature that lets you easily deconstruct and analyze data structures. This article will explore pattern matching with many examples, showing how it streamlines everyday data handling and keeps your code concise.

Examples of Pattern Matching

Pattern matching shines in two key areas. First, the pattern matching feature of switch statements replaces long chains of if statements, letting you elegantly match the selector expression against various data types, including primitives, objects, and even null. Secondly, what if you need to check an object's type and extract specific data? The pattern matching feature of instanceof expressions simplifies this process, allowing you to confirm whether an object matches a pattern and, if so, conveniently extract the desired data. Let's take a look at more examples of pattern matching in Java code.

Pattern Matching With Switch Statements

Java

public static String getAnimalSound(Animal animal) {
    return switch (animal) {
        case Dog dog -> "woof";
        case Cat cat -> "meow";
        case Bird bird -> "chirp";
        case null -> "No animal found!";
        default -> "Unknown animal sound";
    };
}

Matches selector expressions with types other than integers and strings
Uses type patterns (case Dog dog) to check and cast types simultaneously
Handles null directly within the switch block (case null)
Employs arrow syntax (->) for concise body expressions

Pattern Matching With instanceof

Java

if (object instanceof String str) {
    System.out.println("The string is: " + str);
} else if (object instanceof Integer num) {
    System.out.println("The number is: " + num);
} else {
    System.out.println("Unknown object type");
}

Combines type checking and casting in a single expression
Introduces a pattern variable (str, num) to capture the object's value
Avoids explicit casting (String str = (String) object)

Pattern Matching With Primitive Types

Java

int number = 10;
switch (number) {
    case 10:
        System.out.println("The number is 10.");
        break;
    case 20:
        System.out.println("The number is 20.");
        break;
    case 30:
        System.out.println("The number is 30.");
        break;
    default:
        System.out.println("The number is something else.");
}

Pattern matching with primitive types doesn't introduce entirely new functionality but rather simplifies existing practices when working with primitives in switch statements.

Pattern Matching With Reference Types

Java

String name = "Daniel Oh";
switch (name) {
    case "Daniel Oh":
        System.out.println("Hey, Daniel!");
        break;
    case "Jennie Oh":
        System.out.println("Hola, Jennie!");
        break;
    default:
        System.out.println("What’s up!");
}

Pattern matching with reference types makes code easier to understand and maintain due to its clear and concise syntax. By combining type checking and extraction in one step, pattern matching reduces the risk of errors associated with explicit casting.
More expressive switch statements: Switch statements become more versatile and can handle a wider range of data types and scenarios.

Pattern Matching With null

Java

Object obj = null;
switch (obj) {
    case null:
        System.out.println("The object is null.");
        break;
    default:
        System.out.println("The object is not null.");
}

Before Java 21, switch statements would throw a NullPointerException if the selector expression was null. Pattern matching allows a dedicated case null clause to handle this scenario gracefully.
By explicitly checking for null within the switch statement, you avoid potential runtime errors and ensure your code is more robust. Having a dedicated case null clause makes the code's intention clearer compared to needing an external null check before the switch. Java's implementation is designed not to break existing code: if a switch statement doesn't have a case null clause, it will still throw a NullPointerException as before, even if a default case exists.

Pattern Matching With Multiple Patterns

Java

List<String> names = new ArrayList<>();
names.add("Daniel Oh");
names.add("Jennie Oh");

for (String name : names) {
    switch (name) {
        case "Daniel Oh", "Jennie Oh":
            System.out.println("Hola, " + name + "!");
            break;
        default:
            System.out.println("What’s up!");
    }
}

Unlike traditional switch statements, pattern matching considers the order of cases. The first case with a matching pattern is executed. Avoid unreachable code by ensuring subtypes don't appear before their supertypes in the pattern-matching cases.

Extracting Data From Patterns

Java

record Dog(String name) {}

List<Object> animals = new ArrayList<>();
animals.add(new Dog("Rex"));
animals.add(42);

for (Object animal : animals) {
    switch (animal) {
        case Dog(String name):
            System.out.println("Hello, " + name + "!");
            break;
        default:
            System.out.println("Hello stranger!");
    }
}

In the last example, we are using a record pattern to extract the dog's name directly from the Dog object.

Conclusion

Pattern matching is a powerful new feature in Java 21 that can make your code more concise and readable. It is especially useful for working with complex data structures, with key benefits:

Improved readability: Pattern matching makes code more readable by combining type checking, data extraction, and control flow into a single statement. This eliminates the need for verbose if-else chains and explicit casting.
Conciseness: Code becomes more concise by leveraging pattern matching's ability to handle multiple checks and extractions in a single expression. This reduces boilerplate code and improves maintainability.
Enhanced type safety: Pattern matching enforces type safety by explicitly checking and potentially casting the data type within the switch statement or instanceof expression. This reduces the risk of runtime errors caused by unexpected object types.
Null handling: Pattern matching allows for the explicit handling of null cases directly within the switch statement. This eliminates the need for separate null checks before the switch, improving code flow and reducing the chance of null pointer exceptions.
Flexibility: Pattern matching goes beyond basic types. It can handle complex data structures using record patterns (finalized in Java 21). This allows for more expressive matching logic for intricate data objects (a short illustrative sketch follows this conclusion).
Modern look and feel: Pattern matching aligns with modern functional programming paradigms, making Java code more expressive and aligned with other languages that utilize this feature.

Overall, pattern matching in Java 21 streamlines data handling, improves code clarity and maintainability, and enhances type safety for a more robust and developer-friendly coding experience.
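To make the record-pattern point concrete, here is a small, self-contained sketch of nested record deconstruction; the Point and Line records are illustrative assumptions, not types from the examples above:

Java

record Point(int x, int y) {}
record Line(Point start, Point end) {}

static String describe(Object obj) {
    return switch (obj) {
        // Nested record patterns deconstruct the Line and both Points in one step
        case Line(Point(var x1, var y1), Point(var x2, var y2)) ->
                "line from (" + x1 + ", " + y1 + ") to (" + x2 + ", " + y2 + ")";
        case Point(var x, var y) -> "point at (" + x + ", " + y + ")";
        case null -> "nothing to describe";
        default -> "unknown shape";
    };
}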
Generative AI (GenAI) and Large Language Models (LLMs) offer transformative potential across various industries. However, their deployment in production environments faces challenges due to their computational intensity, dynamic behavior, and the potential for inaccurate or undesirable outputs. Existing monitoring tools often fall short of providing real-time insights crucial for managing such applications. Building on top of the existing work, this article presents the framework for monitoring GenAI applications in production. It addresses both infrastructure and quality aspects. On the infrastructure side, one needs to proactively track performance metrics such as cost, latency, and scalability. This enables informed resource management and proactive scaling decisions. To ensure quality and ethical use, the framework recommends real-time monitoring for hallucinations, factuality, bias, coherence, and sensitive content generation. The integrated approach empowers developers with immediate alerts and remediation suggestions, enabling swift intervention and mitigation of potential issues. By combining performance and content-oriented monitoring, this framework fosters the stable, reliable, and ethical deployment of Generative AI within production environments. Introduction The capabilities of GenAI, driven by the power of LLMs, are rapidly transforming the way we interact with technology. From generating remarkably human-like text to creating stunning visuals, GenAI applications are finding their way into diverse production environments. Industries are harnessing this potential for use cases such as content creation, customer service chatbots, personalized marketing, and even code generation. However, the path from promising technology to operationalizing these models remains a big challenge[1]. Ensuring the optimal performance of GenAI applications demands careful management of infrastructure costs associated with model inference, cost, and proactive scaling measures to handle fluctuations in demand. Maintaining user experience requires close attention to response latency. Simultaneously, the quality of the output generated by LLMs is of utmost importance. Developers must grapple with the potential for factual errors, the presence of harmful biases, and the possibility of the models generating toxic or sensitive content. These challenges necessitate a tailored approach to monitoring that goes beyond traditional tools. The need for real-time insights into both infrastructure health and output quality is essential for the reliable and ethical use of GenAI applications in production. This article addresses this critical need by proposing solutions specifically for real-time monitoring of GenAI applications in production. Current Limitations The monitoring and governance of AI systems have garnered significant attention in recent years. Existing literature on AI model monitoring often focuses on supervised learning models [2]. These approaches address performance tracking, drift detection, and debugging in classification or regression tasks. Research in explainable AI (XAI) has also yielded insights into interpreting model decisions, particularly for black-box models [3]. This field seeks to unravel the inner workings of these complex systems or provide post-hoc justifications for outputs [4]. Moreover, studies on bias detection explore techniques for identifying and mitigating discriminatory patterns that may arise from training data or model design [5]. 
While these fields provide a solid foundation, they do not fully address the unique challenges of monitoring and evaluating generative AI applications based on LLMs. Here, the focus shifts away from traditional classification or regression metrics and towards open-ended generation. Evaluating LLMs often involves specialized techniques like human judgment or comparison against reference datasets [6]. Furthermore, standard monitoring and XAI solutions may not be optimized for tracking issues prevalent in GenAI, such as hallucinations, real-time bias detection, or sensitivity to token usage and cost. There has been some recent work in helping solve this challenge [8], [9]. This article builds upon prior work in these related fields while proposing a framework designed specifically for the real-time monitoring needs of production GenAI applications. It emphasizes the integration of infrastructure and quality monitoring, enabling the timely detection of a broad range of potential issues unique to LLM-based applications. This article concentrates on monitoring Generative AI applications utilizing model-as-a-service (MLaaS) offerings such as Google Cloud's Gemini, OpenAI's GPTs, Claude on Amazon Bedrock, etc. While the core monitoring principles remain applicable, self-hosted LLMs necessitate additional considerations. These include model optimization, accelerator (e.g. GPU) management, infrastructure management, scaling, etc - factors outside the scope of this discussion. Also, this article focuses on text-to-text models, but the principles can be extended to other modalities as well. The subsequent sections will focus on various metrics, techniques, and architecture for capturing those metrics to gain visibility into LLM's behavior in production. Application Monitoring Monitoring the performance and resource utilization of Generative AI applications is vital for ensuring their optimal functioning and cost-effectiveness in production environments. This section delves into the key components of application monitoring for GenAI, specifically focusing on cost, latency, and scalability considerations. Cost Monitoring and Optimization The cost associated with deploying GenAI applications can be significant, especially when leveraging MLaaS offerings. Therefore, granular cost monitoring and optimization are crucial. Below are some of the key metrics to focus on: Granular Cost Tracking MLaaS providers typically charge based on factors such as the number of API calls, tokens consumed, model complexity, and data storage. Tracking costs at this level of detail allows for a precise understanding of cost drivers. For MLaaS LLMs, input and output characters/token count can be the key driver of cost. Most models have tokenizer APIs to count the characters/tokens for any given text. These APIs can help understand usage for monitoring and optimizing inference costs. Below is an example of generating a billable character count for Google Cloud’s Gemini model. 
Python

import vertexai
from vertexai.generative_models import GenerativeModel

def generate_count(project_id: str, location: str) -> int:
    # Initialize Vertex AI
    vertexai.init(project=project_id, location=location)
    # Load the model
    model = GenerativeModel("gemini-1.0-pro")
    # Count tokens and billable characters for the prompt
    count = model.count_tokens("how many billable characters are here?")
    # Return the total billable characters for the prompt
    return count.total_billable_characters

generate_count('your-project-id', 'us-central1')

Usage Pattern Analysis and Token Efficiency Analyzing token usage patterns plays a pivotal role in optimizing the operating costs and user experience of GenAI applications. Cloud providers often impose token-per-second quotas, and consistently exceeding these limits can degrade performance. While quota increases may be possible, there are often hard limits. Creative resource management may be required for usage beyond these thresholds. A thorough analysis of token usage over time helps identify avenues for cost optimization. Consider the following strategies: Prompt optimization: Rewriting prompts to reduce their size reduces token consumption and should be a primary focus of optimization efforts. Model tuning: A model fine-tuned on a well-curated dataset can potentially deliver similar or even superior performance with smaller prompts. While some providers charge similar fees for base and tuned models, premium pricing models for tuned models also exist. One needs to be cognizant of these before making a decision. In certain cases, model tuning can significantly reduce token usage and associated costs. Retrieval-augmented generation: Incorporating information retrieval techniques can help reduce input token size by strategically limiting the data fed into the model, potentially reducing costs. Smaller model utilization: A smaller model used in tandem with high-quality data can not only achieve performance comparable to a larger model but also offer a compelling cost-saving strategy. The token count analysis code example provided earlier in the article can be instrumental in understanding and optimizing token usage. It's worth noting that pricing models for tuned models vary across MLaaS providers, highlighting the importance of careful pricing analysis during the selection process. Latency Monitoring In the context of GenAI applications, latency refers to the total time elapsed between a user submitting a request and receiving a response from the model. Ensuring minimal latency is crucial for maintaining a positive user experience, as delays can significantly degrade perceived responsiveness and overall satisfaction. This section delves into the essential components of robust latency monitoring for GenAI applications. Real-Time Latency Measurement Real-time tracking of end-to-end latency is fundamental. This entails measuring the following components: Network latency: Time taken for data to travel between the user's device and the cloud-based MLaaS service. Model inference time: The actual time required for the LLM to process the input and generate a response. Pre/post-processing overhead: Any additional time consumed for data preparation before model execution and formatting responses for delivery. Impact on User Experience Understanding the correlation between latency and user behavior is essential for optimizing the application. Key user satisfaction metrics to analyze include: Bounce rate: The percentage of users who leave a website or application after viewing a single interaction.
Session duration: The length of time a user spends actively engaged with the application. Conversion rates (when applicable): The proportion of users who complete a desired action, such as a purchase or sign-up. Identifying Bottlenecks Pinpointing the primary sources of latency is crucial for targeted fixes. Potential bottleneck areas warranting investigation include: Network performance: Insufficient bandwidth, slow DNS resolution, or network congestion can significantly increase network latency. Model architecture: Large, complex models may have longer inference times. In many cases, smaller models combined with higher-quality data and better prompts can yield the necessary results. Inefficient input/output processing: Unoptimized data handling, encoding, or formatting can add overhead to the overall process. MLaaS platform factors: Service-side performance fluctuations on the MLaaS platform can impact latency. Proactive latency monitoring is vital for maintaining the responsiveness and user satisfaction of GenAI applications in production environments. By understanding the components of latency, analyzing its impact on user experience, and strategically identifying bottlenecks, developers can make informed decisions to optimize their applications. Scalability Monitoring Production-level deployment of GenAI applications necessitates the ability to handle fluctuations in demand gracefully. Regular load and stress testing are essential for evaluating a system's scalability and resilience under realistic and extreme traffic scenarios. These tests should simulate diverse usage patterns, gradual load increases, peak loads, and sustained load. Proactive scalability monitoring is critical, particularly when leveraging MLaaS platforms with hard quota limits for LLMs. This section outlines key metrics and strategies for effective scalability monitoring within these constraints. Autoscaling Configuration Leveraging the autoscaling capabilities provided by MLaaS platforms is crucial for dynamic resource management. Key considerations include: Metrics: Identify the primary metrics that will trigger scaling events (e.g., response time, API requests per second, error rates). Set appropriate thresholds based on performance goals. Scaling policies: Define how quickly resources should be added or removed in response to changes in demand. Consider factors like the time it takes to spin up additional model instances. Cooldown periods: Implement cooldown periods after scaling events to prevent "thrashing" (rapid scaling up and down), which can lead to instability and increased costs. Monitoring Scaling Metrics During scaling events, meticulously monitor these essential metrics: Response time: Ensure that response times remain within acceptable ranges, even when scaling, as latency directly impacts user experience. Throughput: Track the system's overall throughput (e.g., requests per minute) to gauge its capacity to handle incoming requests. Error rates: Monitor for any increases in error rates due to insufficient resources or bottlenecks that can arise during scaling processes. Resource utilization: Observe CPU, memory, and GPU utilization to identify potential resource constraints. MLaaS platforms' hard quota limits pose unique challenges for scaling GenAI applications. Strategies to address this include: Caching: Employ strategic caching of model outputs for frequently requested prompts to reduce the number of model calls.
Batching: Consolidate multiple requests and process them in batches to optimize resource usage. Load balancing: Distribute traffic across multiple model instances behind a load balancer to maximize utilization within available quotas. Hybrid deployment: Consider a hybrid approach where less demanding requests are served by MLaaS models, and those exceeding quotas are handled by a self-hosted deployment (assuming the necessary expertise). Proactive application monitoring, encompassing cost, latency, and scalability aspects, underpins the successful deployment and cost-effective operation of GenAI applications in production. By implementing the strategies outlined above, developers and organizations can gain crucial insights, optimize resource usage, and ensure the responsiveness of their applications for enhanced user experiences. Content Monitoring Ensuring the quality and ethical integrity of GenAI applications in production requires a robust content monitoring strategy. This section addresses the detection of hallucinations, accuracy issues, harmful biases, lack of coherence, and the generation of sensitive content. Hallucination Detection Mitigating the tendency of LLMs to generate plausible but incorrect information is paramount for their ethical and reliable deployment in production settings. This section delves into grounding techniques and strategies for leveraging multiple LLMs to enhance the detection of hallucinations. Human-In-The-Loop To address the inherent issue of hallucinations in LLM-based applications, the human-in-the-loop approach offers two key implementation strategies: End-user feedback: Incorporating direct feedback mechanisms, such as thumbs-up/down ratings and options for detailed textual feedback, provides valuable insights into the LLM's output. This data allows for continuous model refinement and pinpoints areas where hallucinations may be prevalent. End-user feedback creates a collaborative loop that can significantly enhance the LLM's accuracy and trustworthiness over time. Human review sampling: Randomly sampling a portion of LLM-generated outputs and subjecting them to rigorous human review establishes a quality control mechanism. Human experts can identify subtle hallucinations, biases, or factual inconsistencies that automated systems might miss. This process is essential for maintaining a high standard of output, particularly in applications where accuracy is paramount. Implementing these HITL strategies fosters a symbiotic relationship between humans and LLMs. It leverages human expertise to guide and correct the LLM, leading to progressively more reliable and factually sound outputs. This approach is particularly crucial in domains where accuracy and the absence of misleading information are of utmost importance. Grounding in First-Party and Trusted Data Anchoring the output of GenAI applications in reliable data sources offers a powerful method for hallucination detection. This approach is essential, especially when dealing with domain-specific content or scenarios where verifiable facts are required. Techniques include: Prompt engineering with factual constraints: Carefully construct prompts that incorporate domain-specific knowledge, reference external data, or explicitly require the model to adhere to a known factual context. For example, a prompt for summarizing a factual document could include instructions like, "Restrict the summary to information explicitly mentioned in the document."
Retrieval Augmented Generation: Augment LLMs using trusted datasets that prioritize factual accuracy and adherence to provided information. This can help reduce the model's overall tendency to fabricate information. Incorporating external grounding sources: Utilize APIs or services designed to access and process first-party data, trusted knowledge bases, or real-world information. This allows the system to cross-verify the model's output and flag potential discrepancies. For instance, a financial news summarization task could be coupled with an API that provides up-to-date stock market data for accuracy validation. LLM-based output evaluation: The unique capabilities of LLMs can be harnessed to evaluate the factual consistency of the generated text. Strategies include: Self-consistency check: This can be achieved through multi-step generation, where a task is broken into smaller steps, and later outputs are checked for contradictions against prior ones. For instance, asking the model to first outline key points of a document and then generate a full summary allows for verification that the summary aligns with those key points. Alternatively, rephrasing the original prompt in different formats and comparing the resulting outputs can reveal inconsistencies indicative of fabricated information. Cross-model comparison: Feed the output of one LLM as a prompt into a different LLM with potentially complementary strengths. Analyze any inconsistencies or contradictions between the subsequent outputs, which may reveal hallucinations. Metrics for tracking hallucinations: Accurately measuring and quantifying hallucinations generated by LLMs remains an active area of research. While established metrics from fields such as information retrieval and classification offer a foundation, the unique nature of hallucination detection necessitates the adaptation of existing metrics and the development of novel ones. This section proposes a multi-faceted suite of metrics, including standard metrics creatively adapted for this context as well as novel metrics specifically designed to capture the nuances of hallucinated text. Importantly, I encourage practitioners to tailor these metrics to the specific sensitivities of their business domains. Domain-specific knowledge is essential in crafting a metric set that aligns with the unique requirements of each GenAI deployment. Considerations and Future Directions Specificity vs. Open-Endedness Grounding techniques can be highly effective in tasks requiring factual precision. However, in more creative domains where novelty is expected, strict grounding might hinder the model's ability to generate original ideas. Data Quality The reliability of any grounding strategy depends on the quality and trustworthiness of the external data sources used. Verification against curated first-party data or reputable knowledge bases is essential. Computational Overhead Fact-checking, data retrieval, and multi-model evaluation can introduce additional latency and costs that need careful consideration in production environments. Evolving Evaluation Techniques Research into the use of LLMs for semantic analysis and consistency checking is ongoing. More sophisticated techniques for hallucination detection leveraging LLMs are likely to emerge, further bolstering their utility in this task. Grounding and cross-model evaluation provide powerful tools to combat hallucinations in GenAI outputs. 
Used strategically, these techniques bolster the factual accuracy and trustworthiness of these applications, promoting their robust deployment in real-world scenarios. Bias Monitoring The issue of bias in LLMs is a complex and pressing concern, as these models have the potential to perpetuate or amplify harmful stereotypes and discriminatory patterns present in their training data. Proactive bias monitoring is crucial for ensuring the ethical and inclusive deployment of GenAI in production. This section explores data-driven, actionable strategies for bias detection and mitigation. Fairness Evaluation Toolkits Specialized libraries and toolkits offer a valuable starting point for bias assessment in LLM outputs. While not all are explicitly designed for LLM evaluation, many can be adapted and repurposed for this context. Consider the following tools: Aequitas: Provides a suite of metrics and visualizations for assessing group fairness and bias across different demographics. This tool can be used to analyze model outputs for disparities based on sensitive attributes like gender, race, etc. FairTest: Enables the identification and investigation of potential biases in model outputs. It can analyze the presence of discriminatory language or differential treatment of protected groups. Real-Time Analysis In production environments, real-time bias monitoring is essential. Strategies include: Keyword and phrase tracking: Monitor outputs for specific words, phrases, or language patterns historically associated with harmful biases or stereotypes. Tailor these lists to sensitive domains and potential risks related to your application. Dynamic prompting for bias discovery: Systematically test the model with carefully constructed inputs designed to surface potential biases. For example, modify prompts to vary gender, ethnicity, or other attributes while keeping the task consistent, and observe whether the model's output exhibits prejudice (a minimal sketch of this approach follows at the end of this section). Mitigation Strategies When bias is detected, timely intervention is critical. Consider the following actions: Alerting: Implement an alerting system to flag potentially biased outputs for human review and intervention. Calibrate the sensitivity of these alerts based on the severity of bias and its potential impact. Filtering or modification: In sensitive applications, consider automated filtering of highly biased outputs or modification to neutralize harmful language. These measures must be balanced against the potential for restricting valid and unbiased expressions. Human-in-the-loop: Integrate human moderators for nuanced bias assessment and for determining appropriate mitigation steps. This can include re-prompting the model, providing feedback for fine-tuning, or escalating critical issues. Important Considerations Evolving standards: Bias detection is context-dependent and definitions of harmful speech evolve over time. Monitoring systems must remain adaptable. Intersectionality: Biases can intersect across multiple axes (e.g., race, gender, sexual orientation). Monitoring strategies need to account for this complexity. Bias monitoring in GenAI applications is a multifaceted and ongoing endeavor. By combining specialized toolkits, real-time analysis, and thoughtful mitigation strategies, developers can work towards more inclusive and equitable GenAI systems.
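To make the keyword tracking and dynamic prompting ideas above concrete, here is a minimal sketch of a counterfactual bias probe. The generate() stub, the prompt template, and the FLAGGED_TERMS list are illustrative placeholders rather than part of any particular library; substitute the model client your application already uses (for example, the Gemini client shown earlier) and a term list tailored to your domain.

Python

from itertools import product

def generate(prompt: str) -> str:
    # Placeholder: replace with a call to your model endpoint
    # (e.g., the Vertex AI Gemini client shown earlier in this article).
    return f"[model output for: {prompt}]"

# Counterfactual probing: keep the task fixed, vary only a sensitive attribute.
TEMPLATE = "Write a short performance review for {name}, a {attribute} software engineer."
ATTRIBUTES = ["male", "female", "non-binary"]
NAMES = ["Alex", "Sam"]

# Illustrative blocklist; tailor to your domain and sensitivity requirements.
FLAGGED_TERMS = {"aggressive", "emotional", "bossy"}

def flag_terms(text: str) -> set:
    # Return any blocklisted terms present in the output
    lowered = text.lower()
    return {term for term in FLAGGED_TERMS if term in lowered}

def probe_bias() -> list:
    results = []
    for name, attribute in product(NAMES, ATTRIBUTES):
        prompt = TEMPLATE.format(name=name, attribute=attribute)
        output = generate(prompt)
        results.append({
            "attribute": attribute,
            "flags": flag_terms(output),
            "length": len(output.split()),
        })
    return results

if __name__ == "__main__":
    for row in probe_bias():
        status = "ALERT" if row["flags"] else "ok"
        print(f"{status} [{row['attribute']}]: flags={row['flags']} length={row['length']}")

The signals worth alerting on are differences across otherwise identical prompts: flagged terms that appear only for some attributes, or systematic gaps in response length, tone, or sentiment.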
Coherence and Logic Assessment Ensuring the internal consistency and logical flow of GenAI output is crucial for maintaining user trust and avoiding nonsensical results. This section offers techniques for unsupervised coherence and logic assessment, applicable to a variety of LLM-based tasks at scale. Semantic Consistency Checks Semantic Similarity Analysis Calculate the semantic similarity between different segments of the generated text (e.g., sentences, paragraphs). Low similarity scores can indicate a lack of thematic cohesion or abrupt changes in topic. Implementation Leverage pre-trained sentence embedding models (e.g., Sentence Transformers) to compute similarity scores between text chunks. Python

from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer('paraphrase-distilroberta-base-v2')
generated_text = "The company's stock price surged after the earnings report. Cats are excellent pets."
# Split into sentences and drop the empty fragment left by the trailing period
sentences = [s.strip() for s in generated_text.split(".") if s.strip()]
embeddings = model.encode(sentences)
similarity_score = util.cos_sim(embeddings[0], embeddings[1]).item()
print(similarity_score)  # A low score indicates potential incoherence

Topic Modeling Apply topic modeling techniques (e.g., LDA, NMF) to extract latent topics from the generated text. Inconsistent topic distribution across the output may suggest a lack of a central theme or focus. Implementation Utilize libraries like Gensim or scikit-learn for topic modeling. Logical Reasoning Evaluation Entailment and Contradiction Detection Assess whether consecutive sentences within the generated text exhibit logical entailment (one sentence implies the other) or contradiction. This can reveal inconsistencies in reasoning. Implementation Employ entailment models (e.g., BERT-based models fine-tuned on Natural Language Inference datasets like SNLI or MultiNLI). These techniques can be packaged into user-friendly functions or modules, shielding users without deep ML expertise from the underlying complexities. Sensitive Content Detection With GenAI's ability to produce remarkably human-like text, it's essential to be proactive about detecting potentially sensitive content within its outputs. This is necessary to avoid unintended harm, promote responsible use, and maintain trust in the technology. The following section explores modern techniques specifically designed for sensitive content detection within the context of large language models. These scalable approaches will empower users to safeguard the ethical implementation of GenAI across diverse applications. Perspective API integration: Google's Perspective API offers a pre-trained model for identifying toxic comments. It can be integrated into LLM applications to analyze generated text and provide a score for the likelihood that it contains toxic content. The Perspective API can be accessed through a REST API. Here's an example using Python: Python

from googleapiclient import discovery

API_KEY = "YOUR_PERSPECTIVE_API_KEY"  # replace with your Perspective API key

def analyze_text(text):
    client = discovery.build(
        "commentanalyzer",
        "v1alpha1",
        developerKey=API_KEY,
        discoveryServiceUrl="https://commentanalyzer.googleapis.com/$discovery/rest?version=v1alpha1",
        static_discovery=False,
    )
    analyze_request = {
        "comment": {"text": text},
        "requestedAttributes": {"TOXICITY": {}},
    }
    response = client.comments().analyze(body=analyze_request).execute()
    return response["attributeScores"]["TOXICITY"]["summaryScore"]["value"]

text = "This is a hateful comment."
toxicity_score = analyze_text(text)
print(f"Toxicity score: {toxicity_score}")

The API returns a score between 0 and 1, indicating the likelihood of toxicity. Thresholds can be set to flag or filter content exceeding a certain score.
LLM-based safety filter: Major MLaaS providers like Google offer first-party safety filters integrated into their LLM offerings. These filters use internal LLM models trained specifically to detect and mitigate sensitive content. When using Google's Gemini API, the safety filters are automatically applied. You can access different creative text formats with safety guardrails in place. They also provide a second level of safety filters that users can leverage to apply additional filtering based on a set of metrics. For example, Google Cloud’s safety filters are mentioned here. Human-in-the-loop evaluation: Integrating human reviewers in the evaluation process can significantly improve the accuracy of sensitive content detection. Human judgment can help identify nuances and contextual factors that may be missed by automated systems. A platform like Amazon Mechanical Turk can be used to gather human judgments on the flagged content. Evaluator LLM: This involves using a separate LLM (“Evaluator LLM”) specifically to assess the output of the generative LLM for sensitive content. This Evaluator LLM can be trained on a curated dataset labeled for sensitive content. Training an Evaluator LLM requires expertise in deep learning. Open-source libraries like Hugging Face Transformers provide tools and pre-trained models to facilitate this process. An alternative is to use general-purpose LLMs such as Gemini or GPT with appropriate prompts to discover sensitive content. The language used to express sensitive content constantly evolves, requiring continuous updates to the detection models. By combining these scalable techniques and carefully addressing the associated challenges, we can build robust systems for detecting and mitigating sensitive content in LLM outputs, ensuring responsible and ethical deployment of this powerful technology. Conclusion Ensuring the reliable, ethical, and cost-effective deployment of Generative AI applications in production environments requires a multifaceted approach to monitoring. This article presented a framework specifically designed for real-time monitoring of GenAI, addressing both infrastructure and quality considerations. On the infrastructure side, proactive tracking of cost, latency, and scalability is essential. Tools for analyzing token usage, optimizing prompts, and leveraging auto-scaling capabilities play a crucial role in managing operational expenses and maintaining a positive user experience. Content monitoring is equally important for guaranteeing the quality and ethical integrity of GenAI applications. This includes techniques for detecting hallucinations, such as grounding in reliable data sources and incorporating human-in-the-loop verification mechanisms. Strategies for bias mitigation, coherence assessment, and sensitive content detection are vital for promoting inclusivity and preventing harmful outputs. By integrating the monitoring techniques outlined in this article, developers can gain deeper insights into the performance, behavior, and potential risks associated with their GenAI applications. This proactive approach empowers them to take informed corrective actions, optimize resource utilization, and ultimately deliver reliable, trustworthy, and ethical AI-powered experiences to users. While we have focused on MLaaS offerings, the principles discussed can be adapted to self-hosted LLM deployments. The field of GenAI monitoring is rapidly evolving. 
Researchers and practitioners should remain vigilant regarding new developments in hallucination detection, bias mitigation, and evaluation techniques. Additionally, it's crucial to recognize the ongoing debate around the balance between accuracy restrictions and creativity in generative models. References [1] M. Korolov, "For IT leaders, operationalized gen AI is still a moving target," CIO, Feb. 28, 2024. [2] O. Simeone, "A Very Brief Introduction to Machine Learning With Applications to Communication Systems," IEEE Transactions on Cognitive Communications and Networking, vol. 4, no. 4, pp. 648-664, Dec. 2018, doi: 10.1109/TCCN.2018.2881441. [3] F. Doshi-Velez and B. Kim, "Towards A Rigorous Science of Interpretable Machine Learning," arXiv, 2017. [Online]. [4] A. B. Arrieta et al., "Explainable Artificial Intelligence (XAI): Concepts, taxonomies, opportunities and challenges toward responsible AI," Information Fusion 58 (2020): 82-115. [5] A. Saleiro et al., "Aequitas: A Bias and Fairness Audit Toolkit," arXiv, 2018. [Online]. [6] E. Bender and A. Koller, "Climbing towards NLU: On Meaning, Form, and Understanding in the Age of Data," Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, 2020. [7] S. Mousavi et al., "Enhancing Large Language Models with Ensemble of Critics for Mitigating Toxicity and Hallucination," OpenReview. [8] X. Amatriain, "Measuring And Mitigating Hallucinations In Large Language Models: A Multifaceted Approach," Mar. 2024. [Online].
In the realm of modern enterprise integration, the harmonious interaction between different systems and platforms is paramount. Salesforce, being one of the leading CRM platforms, often necessitates seamless integration with other systems to leverage its data and functionalities. MuleSoft, on the other hand, is a powerful integration platform that facilitates connecting various applications and APIs. One common scenario in Salesforce integration is invoking APEX REST methods from MuleSoft to interact with Salesforce data. In this blog post, we'll walk through the process of invoking an APEX REST method in MuleSoft and fetching account information from Salesforce using APEX code snippets. Why Use APEX REST Methods? Salesforce APEX REST methods provide a flexible way to expose custom functionalities or access Salesforce data through RESTful APIs. Leveraging APEX REST methods allows for fine-grained control over what data is exposed and how it's accessed, making it a preferred choice for integrating Salesforce with external systems like MuleSoft. Prerequisites Before we dive into the integration process, ensure you have the following prerequisites in place: Salesforce developer account: You'll need a Salesforce Developer account to create and test APEX REST methods. MuleSoft Anypoint platform account: Sign up for MuleSoft Anypoint Platform to create Mule applications. Integration Steps 1. Create APEX REST Method in Salesforce First, let's create an APEX REST method in Salesforce to fetch account information. Here's a simple example: Apex

@RestResource(urlMapping='/accountInfo/*')
global with sharing class AccountInfoRestController {
    @HttpGet
    global static Account getAccountInfoById() {
        RestRequest req = RestContext.request;
        String accountId = req.requestURI.substring(req.requestURI.lastIndexOf('/') + 1);
        // Fetch Account information by Id
        Account acc = [SELECT Id, Name, Industry, Phone, BillingCity FROM Account WHERE Id = :accountId];
        return acc;
    }
}

In this code snippet, we define an APEX class AccountInfoRestController annotated with @RestResource to expose a REST endpoint at /accountInfo/*. The getAccountInfoById() method retrieves account information based on the provided Account ID. 2. Deploy APEX Class in Salesforce Deploy the AccountInfoRestController class to your Salesforce organization. 3. Create MuleSoft Application Create a new MuleSoft application in Anypoint Studio. File --> New --> Mule Project 4. Configure HTTP Connector Add an HTTP Listener to your Mule flow to receive incoming HTTP requests. Configure the listener with the appropriate host and port. 5. Invoke the APEX REST Method in MuleSoft There are two ways to invoke the APEX REST method in MuleSoft. 5.1. Using HTTP Requester Add an HTTP Request component after the HTTP Listener and configure it to make a GET request to the Salesforce APEX REST endpoint: URL: Specify the URL of the APEX REST method in the format https://<Salesforce_Instance_URL>/services/apexrest/accountInfo/<Account_Id>. Method: Set the method to GET. Headers: Include any required headers, such as authentication tokens. 5.2.
Using Invoke APEX Rest API Connector Add an Invoke APEX Rest API Connector after the HTTP Listener and configure it to make a GET request to the Salesforce APEX REST endpoint: APEX class: Specify the name of the class, AccountInfoRestController. APEX method: This is a combination of four attributes in the form methodName^urlMapping^httpMethod^returnType, for example getAccountInfoById^/accountInfo^HttpGet^Account. Salesforce configuration: Basic Authentication Username: Your login user ID Password: Your login password Security Token: Generate from Profile --> Settings --> Reset My Security Token 6. Parse Response After invoking the APEX REST method, parse the response body to extract account information. Use a Transform Message component to shape the payload into the structure required by the calling flow. JSON { "Id": "xxxxxxxxxxxxxxx", "Name": "Example Account", "Industry": "Technology", "Phone": "(555) 555-5555", "BillingCity": "San Francisco" } 7. Handle Errors and Transform Data Implement error handling and data transformation as per your application requirements. 8. Complete the Mule Flow Complete the Mule flow with the necessary components for further processing or response generation. Conclusion Integrating Salesforce APEX REST methods in MuleSoft enables seamless interaction between Salesforce and other systems. By following the steps outlined in this guide, you can effectively invoke APEX REST methods from MuleSoft and retrieve account information from Salesforce. This integration approach empowers organizations to unlock the full potential of their Salesforce data within their broader application ecosystem.
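As a quick sanity check before wiring the call into a Mule flow, the Apex endpoint from step 5.1 can be invoked directly. The sketch below uses Python's requests library; the instance URL, account Id, and bearer token are placeholders that would come from your own org (for example, an OAuth access token issued to a connected app).

Python

import requests

# Placeholders: substitute your org's instance URL, a valid Account Id,
# and an OAuth access token obtained via a Salesforce connected app.
INSTANCE_URL = "https://yourInstance.my.salesforce.com"
ACCOUNT_ID = "001XXXXXXXXXXXXXXX"
ACCESS_TOKEN = "your_access_token"

def get_account_info(account_id: str) -> dict:
    # Same URL format as step 5.1: /services/apexrest/accountInfo/<Account_Id>
    url = f"{INSTANCE_URL}/services/apexrest/accountInfo/{account_id}"
    headers = {"Authorization": f"Bearer {ACCESS_TOKEN}"}
    response = requests.get(url, headers=headers, timeout=30)
    response.raise_for_status()
    return response.json()

if __name__ == "__main__":
    account = get_account_info(ACCOUNT_ID)
    print(account.get("Name"), account.get("BillingCity"))

A successful call should return the same JSON structure shown in step 6.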
In today's dynamic and complex cloud environments, observability has become a cornerstone for maintaining the reliability, performance, and security of applications. Kubernetes, the de facto standard for container orchestration, hosts a plethora of applications, making the need for an efficient and scalable observability framework paramount. This article delves into how OpenTelemetry, an open-source observability framework, can be seamlessly integrated into a Kubernetes (K8s) cluster managed by KIND (Kubernetes IN Docker), and how tools like Loki, Tempo, and the kube-prometheus-stack can enhance your observability strategy. We'll explore this setup through the lens of a practical example, utilizing custom values from a specific GitHub repository. The Observability Landscape in Kubernetes Before diving into the integration, let's understand the components at play: KIND offers a straightforward way to run K8s clusters within Docker containers, ideal for development and testing. Loki is a horizontally scalable, highly available, multi-tenant log aggregation system inspired by Prometheus. Tempo is a high-volume, minimal-dependency trace aggregator, providing a robust way to store and query distributed traces. kube-prometheus-stack bundles Prometheus together with Grafana and other tools to provide a comprehensive monitoring solution out-of-the-box. OpenTelemetry Operator simplifies the deployment and management of OpenTelemetry collectors in K8s environments. Promtail is responsible for gathering logs and sending them to Loki. Integrating these components within a K8s cluster orchestrated by KIND not only streamlines the observability but also leverages the strengths of each tool, creating a cohesive and powerful monitoring solution. Setting up Your Kubernetes Cluster With KIND Firstly, ensure you have KIND installed on your machine. If not, you can easily install it using the following command: Shell curl -Lo ./kind https://kind.sigs.k8s.io/dl/v0.11.1/kind-$(uname)-amd64 chmod +x ./kind mv ./kind /usr/local/bin/kind Once KIND is installed, you can create a cluster by running: Shell kind create cluster --config kind-config.yaml kubectl create ns observability kubectl config set-context --current --namespace observability kind-config.yaml should be tailored to your specific requirements. It's important to ensure your cluster has the necessary resources (CPU, memory) to support the observability tools you plan to deploy. Deploying Observability Tools With HELM HELM, the package manager for Kubernetes, simplifies the deployment of applications. Here's how you can install Loki, Tempo, and the kube-prometheus-stack using HELM: Add the necessary HELM repositories: helm repo add grafana https://grafana.github.io/helm-charts helm repo add prometheus-community https://prometheus-community.github.io/helm-charts helm repo update Install Loki, Tempo, and kube-prometheus-stack: For each tool, we'll use a custom values file available in the provided GitHub repository. This ensures a tailored setup aligned with specific monitoring and tracing needs. 
Loki: helm upgrade --install loki grafana/loki --values https://raw.githubusercontent.com/brainupgrade-in/kubernetes/main/observability/opentelemetry/01-loki-values.yaml Tempo: helm install tempo grafana/tempo --values https://raw.githubusercontent.com/brainupgrade-in/kubernetes/main/observability/opentelemetry/02-tempo-values.yaml kube-prometheus-stack: helm install kube-prometheus-stack prometheus-community/kube-prometheus-stack --values https://raw.githubusercontent.com/brainupgrade-in/kubernetes/main/observability/opentelemetry/03-grafana-helm-values.yaml Install OpenTelemetry Operator and Promtail: The OpenTelemetry Operator and Promtail can also be installed via HELM, further streamlining the setup process. OpenTelemetry Operator: helm install opentelemetry-operator open-telemetry/opentelemetry-operator Promtail: helm install promtail grafana/promtail --set "loki.serviceName=loki.observability.svc.cluster.local" Configuring OpenTelemetry for Optimal Observability Once the OpenTelemetry Operator is installed, you'll need to configure it to collect metrics, logs, and traces from your applications. OpenTelemetry provides a unified way to send observability data to various backends like Loki for logs, Prometheus for metrics, and Tempo for traces. A sample OpenTelemetry Collector configuration might look like this: YAML apiVersion: opentelemetry.io/v1alpha1 kind: OpenTelemetryCollector metadata: name: otel namespace: observability spec: config: | receivers: filelog: include: ["/var/log/containers/*.log"] otlp: protocols: grpc: endpoint: 0.0.0.0:4317 http: endpoint: 0.0.0.0:4318 processors: memory_limiter: check_interval: 1s limit_percentage: 75 spike_limit_percentage: 15 batch: send_batch_size: 1000 timeout: 10s exporters: # NOTE: Prior to v0.86.0 use `logging` instead of `debug`. debug: prometheusremotewrite: endpoint: "http://prometheus-kube-prometheus-prometheus.observability:9090/api/v1/write" loki: endpoint: "http://loki.observability:3100/loki/api/v1/push" otlp: endpoint: http://tempo.observability.svc.cluster.local:4317 retry_on_failure: enabled: true tls: insecure: true service: pipelines: traces: receivers: [otlp] processors: [memory_limiter, batch] exporters: [debug,otlp] metrics: receivers: [otlp] processors: [memory_limiter, batch] exporters: [debug,prometheusremotewrite] logs: receivers: [otlp] processors: [memory_limiter, batch] exporters: [debug,loki] mode: daemonset This configuration sets up the collector to receive data via the OTLP protocol, process it in batches, and export it to the appropriate backends. To enable auto-instrumentation for java apps, you can define the following. YAML apiVersion: opentelemetry.io/v1alpha1 kind: Instrumentation metadata: name: java-instrumentation namespace: observability spec: exporter: endpoint: http://otel-collector.observability:4317 propagators: - tracecontext - baggage sampler: type: always_on argument: "1" java: env: - name: OTEL_EXPORTER_OTLP_ENDPOINT value: http://otel-collector.observability:4317 Leveraging Observability Data for Insights With the observability tools in place, you can now leverage the collected data to gain actionable insights into your application's performance, reliability, and security. Grafana can be used to visualize metrics and logs, while Tempo allows you to trace distributed transactions across microservices. Visualizing Data With Grafana Grafana offers a powerful platform for creating dashboards that visualize the metrics and logs collected by Prometheus and Loki, respectively. 
You can create custom dashboards or import existing ones tailored to Kubernetes monitoring. Tracing With Tempo Tempo, integrated with OpenTelemetry, provides a detailed view of traces across microservices, helping you pinpoint the root cause of issues and optimize performance. Illustrating Observability With a Weather Application Example To bring the concepts of observability to life, let's walk through a practical example using a simple weather application deployed in our Kubernetes cluster. This application, structured around microservices, showcases how OpenTelemetry can be utilized to gather crucial metrics, logs, and traces. The configuration for this demonstration is based on a sample Kubernetes deployment found here. Deploying the Weather Application Our weather application is a microservice that fetches weather data. It's a perfect candidate to illustrate how OpenTelemetry captures and forwards telemetry data to our observability stack. Here's a partial snippet of the deployment configuration. Full YAML is found here. YAML apiVersion: apps/v1 kind: Deployment metadata: labels: app: weather tier: front name: weather-front spec: replicas: 1 selector: matchLabels: app: weather tier: front template: metadata: labels: app: weather tier: front app.kubernetes.io/name: weather-front annotations: prometheus.io/scrape: "true" prometheus.io/port: "8888" prometheus.io/path: /actuator/prometheus instrumentation.opentelemetry.io/inject-java: "true" # sidecar.opentelemetry.io/inject: 'true' instrumentation.opentelemetry.io/container-names: "weather-front" spec: containers: - image: brainupgrade/weather:metrics imagePullPolicy: Always name: weather-front resources: limits: cpu: 1000m memory: 2048Mi requests: cpu: 100m memory: 1500Mi env: - name: APP_NAME valueFrom: fieldRef: apiVersion: v1 fieldPath: metadata.labels['app.kubernetes.io/name'] - name: NAMESPACE valueFrom: fieldRef: apiVersion: v1 fieldPath: metadata.namespace - name: OTEL_SERVICE_NAME value: $(NAMESPACE)-$(APP_NAME) - name: spring.application.name value: $(NAMESPACE)-$(APP_NAME) - name: spring.datasource.url valueFrom: configMapKeyRef: name: app-config key: spring.datasource.url - name: spring.datasource.username valueFrom: secretKeyRef: name: app-secret key: spring.datasource.username - name: spring.datasource.password valueFrom: secretKeyRef: name: app-secret key: spring.datasource.password - name: weatherServiceURL valueFrom: configMapKeyRef: name: app-config key: weatherServiceURL - name: management.endpoints.web.exposure.include value: "*" - name: management.server.port value: "8888" - name: management.metrics.web.server.request.autotime.enabled value: "true" - name: management.metrics.tags.application value: $(NAMESPACE)-$(APP_NAME) - name: otel.instrumentation.log4j.capture-logs value: "true" - name: otel.logs.exporter value: "otlp" ports: - containerPort: 8080 This deployment configures the weather service with OpenTelemetry's OTLP (OpenTelemetry Protocol) exporter, directing telemetry data to our OpenTelemetry Collector. It also labels the service for clear identification within our telemetry data. Visualizing Observability Data Once deployed, the weather service starts sending metrics, logs, and traces to our observability tools. Here's how you can leverage this data. Trace the request across services using Tempo datasource Metrics: Prometheus, part of the kube-prometheus-stack, collects metrics on the number of requests, response times, and error rates. 
These metrics can be visualized in Grafana to monitor the health and performance of the weather service. For example, grafana dashboard (ID 17175) can be used to view Observability for Spring boot apps Logs: Logs generated by the weather service are collected by Promtail and stored in Loki. Grafana can query these logs, allowing you to search and visualize operational data. This is invaluable for debugging issues, such as understanding the cause of an unexpected spike in error rates. Traces: Traces captured by OpenTelemetry and stored in Tempo provide insight into the request flow through the weather service. This is crucial for identifying bottlenecks or failures in the service's operations. Gaining Insights With the weather application up and running, and observability data flowing, we can start to gain actionable insights: Performance optimization: By analyzing response times and error rates, we can identify slow endpoints or errors in the weather service, directing our optimization efforts more effectively. Troubleshooting: Logs and traces help us troubleshoot issues by providing context around errors or unexpected behavior, reducing the time to resolution. Scalability decisions: Metrics on request volumes and resource utilization guide decisions on when to scale the service to handle load more efficiently. This weather service example underscores the power of OpenTelemetry in a Kubernetes environment, offering a window into the operational aspects of applications. By integrating observability into the development and deployment pipeline, teams can ensure their applications are performant, reliable, and scalable. This practical example of a weather application illustrates the tangible benefits of implementing a comprehensive observability strategy with OpenTelemetry. It showcases how seamlessly metrics, logs, and traces can be collected, analyzed, and visualized, providing developers and operators with the insights needed to maintain and improve complex cloud-native applications. Conclusion Integrating OpenTelemetry with Kubernetes using tools like Loki, Tempo, and the kube-prometheus-stack offers a robust solution for observability. This setup not only simplifies the deployment and management of these tools but also provides a comprehensive view of your application's health, performance, and security. With the actionable insights gained from this observability stack, teams can proactively address issues, improve system reliability, and enhance the user experience. Remember, the key to successful observability lies in the strategic implementation and continuous refinement of your monitoring setup. Happy observability!
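As a final practical note, a tiny script can confirm that the collector pipeline accepts OTLP data end to end before any application is instrumented. The sketch below uses the OpenTelemetry Python SDK (the opentelemetry-sdk and opentelemetry-exporter-otlp packages) and assumes the collector's OTLP gRPC port has been made reachable locally, for instance with kubectl port-forward; adjust the service name and endpoint to match your deployment.

Python

from opentelemetry import trace
from opentelemetry.sdk.resources import Resource
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import BatchSpanProcessor
from opentelemetry.exporter.otlp.proto.grpc.trace_exporter import OTLPSpanExporter

# Assumes the collector's OTLP gRPC port is reachable locally, e.g.:
#   kubectl -n observability port-forward svc/otel-collector 4317:4317
exporter = OTLPSpanExporter(endpoint="localhost:4317", insecure=True)

provider = TracerProvider(resource=Resource.create({"service.name": "pipeline-smoke-test"}))
provider.add_span_processor(BatchSpanProcessor(exporter))
trace.set_tracer_provider(provider)

tracer = trace.get_tracer(__name__)
with tracer.start_as_current_span("smoke-test-span") as span:
    span.set_attribute("test.purpose", "verify collector pipeline")

# Flush and shut down so the batched span is exported before the script exits.
provider.shutdown()

If the pipeline is healthy, the span should appear in the collector's debug exporter output and, shortly afterwards, be queryable in Tempo through Grafana.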
In an era where instant access to data is not just a luxury but a necessity, distributed caching has emerged as a pivotal technology in optimizing application performance. With the exponential growth of data and the demand for real-time processing, traditional methods of data storage and retrieval are proving inadequate. This is where distributed caching comes into play, offering a scalable, efficient, and faster way of handling data across various networked resources. Understanding Distributed Caching What Is Distributed Caching? Distributed caching refers to a method where information is stored across multiple servers, typically spread across various geographical locations. This approach ensures that data is closer to the user, reducing access time significantly compared to centralized databases. The primary goal of distributed caching is to enhance speed and reduce the load on primary data stores, thereby improving application performance and user experience. Key Components Cache store: At its core, the distributed cache relies on the cache store, where data is kept in-memory across multiple nodes. This arrangement ensures swift data retrieval and resilience to node failures. Cache engine: This engine orchestrates the operations of storing and retrieving data. It manages data partitioning for balanced distribution across nodes and load balancing to maintain performance during varying traffic conditions. Cache invalidation mechanism: A critical aspect that keeps the cache data consistent with the source database. Techniques such as time-to-live (TTL), write-through, and write-behind caching are used to ensure timely updates and data accuracy. Replication and failover processes: These processes provide high availability. They enable the cache system to maintain continuous operation, even in the event of node failures or network issues, by replicating data and providing backup nodes. Security and access control: Integral to protecting the cached data, these mechanisms safeguard against unauthorized access and ensure the integrity and confidentiality of data within the cache. Why Distributed Caching? Distributed caching is a game-changer in the realm of modern applications, offering distinct advantages that ensure efficient, scalable, and reliable software solutions. Speed and performance: Think of distributed caching as having express checkout lanes in a grocery store. Just as these lanes speed up the shopping experience, distributed caching accelerates data retrieval by storing frequently accessed data in memory. This results in noticeably faster and more responsive applications, especially important for dynamic platforms like e-commerce sites, real-time analytics tools, and interactive online games. Scaling with ease: As your application grows and attracts more users, it's like a store becoming more popular. You need more checkout lanes (or in this case, cache nodes) to handle the increased traffic. Distributed caching makes adding these extra lanes simple, maintaining smooth performance no matter how busy things get. Always up, always available: Imagine if one express lane closes unexpectedly – in a well-designed store, this isn’t a big deal because there are several others open. Similarly, distributed caching replicates data across various nodes. So, if one node goes down, the others take over without any disruption, ensuring your application remains up and running at all times. Saving on costs: Finally, using distributed caching is like smartly managing your store’s resources. 
It reduces the load on your main databases (akin to not overstaffing every lane) and, as a result, lowers operational costs. This efficient use of resources means your application does more with less, optimizing performance without needing excessive investment in infrastructure. How Distributed Caching Works Imagine you're in a large library with lots of books (data). Every time you need a book, you must ask the librarian (the main database), who then searches through the entire library to find it. This process can be slow, especially if many people are asking for books at the same time. Now, enter distributed caching. Creating a mini-library (cache nodes): In our library, we set up several small bookshelves (cache nodes) around the room. These mini-libraries store copies of the most popular books (frequently accessed data). So, when you want one of these books, you just grab it from the closest bookshelf, which is much faster than waiting for the librarian. Keeping the mini-libraries updated (cache invalidation): To ensure that the mini-libraries have the latest versions of the books, we have a system. Whenever a new edition comes out, or a book is updated, the librarian makes sure that these changes are reflected in the copies stored on the mini bookshelves. This way, you always get the most current information. Expanding the library (scalability): As more people come to the library, we can easily add more mini bookshelves or put more copies of popular books on existing shelves. This is like scaling the distributed cache — we can add more cache nodes or increase their capacity, ensuring everyone gets their books quickly, even when the library is crowded. Always open (high availability): What if one of the mini bookshelves is out of order (a node fails)? Well, there are other mini bookshelves with the same books, so you can still get what you need. This is how distributed caching ensures that data is always available, even if one part of the system goes down. In essence, distributed caching works by creating multiple quick-access points for frequently needed data, making it much faster to retrieve. It's like having speedy express lanes in a large library, ensuring that you get your book quickly, the library runs efficiently, and everybody leaves happy. Caching Strategies Distributed caching strategies are like different methods used in a busy restaurant to ensure customers get their meals quickly and efficiently. Here's how these strategies work in a simplified manner: Cache-aside (lazy loading): Imagine a waiter who only prepares a dish when a customer orders it. Once cooked, he keeps a copy in the kitchen for any future orders. In caching, this is like loading data into the cache only when it's requested. It ensures that only necessary data is cached, but the first request might be slower as the data is not preloaded (a minimal code sketch of this pattern appears at the end of this article). Write-through caching: This is like a chef who prepares a new dish and immediately stores its recipe in a quick-reference guide. Whenever that dish is ordered, the chef can quickly recreate it using the guide. In caching, data is saved in the cache and the database simultaneously. This method ensures data consistency but might be slower for write operations. Write-around caching: Consider this as a variation of the write-through method. Here, when a new dish is created, the recipe isn't immediately put into the quick-reference guide. It's added only when it's ordered again.
In caching, data is written directly to the database and only written to the cache if it's requested again. This reduces the cache being filled with infrequently used data but might make the first read slower. Write-back caching: Imagine the chef writes down new recipes in the quick-reference guide first and updates the main recipe book later when there’s more time. In caching, data is first written to the cache and then, after some delay, written to the database. This speeds up write operations but carries a risk if the cache fails before the data is saved to the database. Each of these strategies has its pros and cons, much like different techniques in a restaurant kitchen. The choice depends on what’s more important for the application – speed, data freshness, or consistency. It's all about finding the right balance to serve up the data just the way it's needed! Consistency Models Understanding distributed caching consistency models can be simplified by comparing them to different methods of updating news on various bulletin boards across a college campus. Each bulletin board represents a cache node, and the news is the data you're caching. Strong consistency: This is like having an instant update on all bulletin boards as soon as a new piece of news comes in. Every time you check any board, you're guaranteed to see the latest news. In distributed caching, strong consistency ensures that all nodes show the latest data immediately after it's updated. It's great for accuracy but can be slower because you have to wait for all boards to be updated before continuing. Eventual consistency: Imagine that new news is first posted on the main bulletin board and then, over time, copied to other boards around the campus. If you check a board immediately after an update, you might not see the latest news, but give it a little time, and all boards will show the same information. Eventual consistency in distributed caching means that all nodes will eventually hold the same data, but there might be a short delay. It’s faster but allows for a brief period where different nodes might show slightly outdated information. Weak consistency: This is like having updates made to different bulletin boards at different times without a strict schedule. If you check different boards, you might find varying versions of the news. In weak consistency for distributed caching, there's no guarantee that all nodes will be updated at the same time, or ever fully synchronized. This model is the fastest, as it doesn't wait for updates to propagate to all nodes, but it's less reliable for getting the latest data. Read-through and write-through caching: These methods can be thought of as always checking or updating the main news board (the central database) when getting or posting news. In read-through caching, every time you read data, it checks with the main database to ensure it's up-to-date. In write-through caching, every time you update data, it updates the main database first before the bulletin boards. These methods ensure consistency between the cache and the central database but can be slower due to the constant checks or updates. Each of these models offers a different balance between ensuring data is up-to-date across all nodes and the speed at which data can be accessed or updated. The choice depends on the specific needs and priorities of your application. Use Cases E-Commerce Platforms Normal caching: Imagine a small boutique with a single counter for popular items. 
This helps a bit, as customers can quickly grab what they frequently buy. But when there's a big sale, the counter gets overcrowded, and people wait longer. Distributed caching: Now think of a large department store with multiple counters (nodes) for popular items, scattered throughout. During sales, customers can quickly find what they need from any nearby counter, avoiding long queues. This setup is excellent for handling heavy traffic and large, diverse inventories, typical in e-commerce platforms. Online Gaming Normal caching: It’s like having one scoreboard in a small gaming arcade. Players can quickly see scores, but if too many players join, updating and checking scores becomes slow. Distributed caching: In a large gaming complex with scoreboards (cache nodes) in every section, players anywhere can instantly see updates. This is crucial for online gaming, where real-time data (like player scores or game states) needs fast, consistent updates across the globe. Real-Time Analytics Normal caching: It's similar to having a single newsstand that quickly provides updates on certain topics. It's faster than searching through a library but can get overwhelming during peak news times. Distributed caching: Picture a network of digital screens (cache nodes) across a city, each updating in real-time with news. For applications analyzing live data (like financial trends or social media sentiment), this means getting instant insights from vast, continually updated data sources. Choosing the Right Distributed Caching Solution When selecting a distributed caching solution, consider the following: Performance and latency: Assess the solution's ability to handle your application’s load, especially under peak usage. Consider its read/write speed, latency, and how well it maintains performance consistency. This factor is crucial for applications requiring real-time responsiveness. Scalability and flexibility: Ensure the solution can horizontally scale as your user base and data volume grow. The system should allow for easy addition or removal of nodes with minimal impact on ongoing operations. Scalability is essential for adapting to changing demands. Data consistency and reliability: Choose a consistency model (strong, eventual, etc.) that aligns with your application's needs. Also, consider how the system handles node failures and data replication. Reliable data access and accuracy are vital for maintaining user trust and application integrity. Security features: Given the sensitive nature of data today, ensure the caching solution has robust security features, including authentication, authorization, and data encryption. This is especially important if you're handling personal or sensitive user data. Cost and total ownership: Evaluate the total cost of ownership, including licensing, infrastructure, and maintenance. Open-source solutions might offer cost savings but consider the need for internal expertise. Balancing cost with features and long-term scalability is key for a sustainable solution. Implementing Distributed Caching Implementing distributed caching effectively requires a strategic approach, especially when transitioning from normal (single-node) caching. Here’s a concise guide: Assessment and Planning Normal caching: Typically involves setting up a single cache server, often co-located with the application server. Distributed caching: Start with a thorough assessment of your application’s performance bottlenecks and data access patterns. 
Plan for multiple cache nodes, distributed across different servers or locations, to handle higher loads and ensure redundancy. Choosing the Right Technology Normal caching: Solutions like Redis or Memcached can be sufficient for single-node caching. Distributed caching: Select a distributed caching technology that aligns with your scalability, performance, and consistency needs. Redis Cluster, Apache Ignite, or Hazelcast are popular choices. Configuration and Deployment Normal caching: Configuration is relatively straightforward, focusing mainly on the memory allocation and cache eviction policies. Distributed caching: Requires careful configuration of data partitioning, replication strategies, and node discovery mechanisms. Ensure cache nodes are optimally distributed to balance load and minimize latency. Data Invalidation and Synchronization Normal caching: Less complex, often relying on TTL (time-to-live) settings for data invalidation. Distributed caching: Implement more sophisticated invalidation strategies like write-through or write-behind caching. Ensure synchronization mechanisms are in place for data consistency across nodes. Monitoring and Maintenance Normal caching: Involves standard monitoring of cache hit rates and memory usage. Distributed caching: Requires more advanced monitoring of individual nodes, network latency between nodes, and overall system health. Set up automated scaling and failover processes for high availability. Security Measures Normal caching: Basic security configurations might suffice. Distributed caching: Implement robust security protocols, including encryption in transit and at rest, and access controls. Challenges and Best Practices Challenges Cache invalidation: Ensuring that cached data is updated or invalidated when the underlying data changes. Data synchronization: Keeping data synchronized across multiple cache nodes. Best Practices Regularly monitor cache performance: Use monitoring tools to track hit-and-miss ratios and adjust strategies accordingly. Implement robust cache invalidation mechanisms: Use techniques like time-to-live (TTL) or explicit invalidation. Plan for failover and recovery: Ensure that your caching solution can handle node failures gracefully. Conclusion Distributed caching is an essential component in the architectural landscape of modern applications, especially those requiring high performance and scalability. By understanding the fundamentals, evaluating your needs, and following best practices, you can harness the power of distributed caching to elevate your application's performance, reliability, and user experience. As technology continues to evolve, distributed caching will play an increasingly vital role in managing the growing demands for fast and efficient data access.
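To make the write strategies and the read-through behavior described in this article concrete, here is a minimal Python sketch. It is illustrative only: a plain dictionary stands in for the database, and the class and key names are invented for the example rather than taken from any caching library.

Python
# Minimal sketch of the write strategies discussed above. A plain dict stands in
# for the database, and all class and key names are invented for illustration.

class WriteThroughCache:
    """Writes hit the database and the cache together; reads are read-through."""

    def __init__(self, database):
        self.database = database
        self.cache = {}

    def write(self, key, value):
        self.database[key] = value   # update the source of truth first
        self.cache[key] = value      # keep the cache in step with it

    def read(self, key):
        if key not in self.cache:    # read-through: load from the database on a miss
            self.cache[key] = self.database[key]
        return self.cache[key]


class WriteBackCache:
    """Writes land in the cache immediately and are persisted later in bulk."""

    def __init__(self, database):
        self.database = database
        self.cache = {}
        self.dirty = set()           # keys written to the cache but not yet persisted

    def write(self, key, value):
        self.cache[key] = value      # fast write; persistence is deferred
        self.dirty.add(key)

    def flush(self):
        for key in self.dirty:       # anything still here is lost if the cache dies first
            self.database[key] = self.cache[key]
        self.dirty.clear()


db = {}
store = WriteBackCache(db)
store.write("sku-42", {"price": 9.99})
print("sku-42" in db)                # False: the database has not seen the write yet
store.flush()
print(db["sku-42"])                  # {'price': 9.99}

In practice, the same trade-offs are expressed as configuration on a caching system such as Redis or Memcached rather than as hand-written classes, but the failure modes are the same: a slower first read when the cache is bypassed on writes, and possible data loss if a write-back cache dies before it flushes.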
This article presents an in-depth analysis of the service mesh landscape, focusing specifically on Istio, one of the most popular service mesh frameworks. A service mesh is a dedicated infrastructure layer for managing service-to-service communication in the world of microservices. Istio, built to seamlessly integrate with platforms like Kubernetes, provides a robust way to connect, secure, control, and observe services. This article explores Istio’s architecture, its key features, and the value it provides in managing microservices at scale. Service Mesh A Kubernetes service mesh is a tool that improves the security, monitoring, and reliability of applications on Kubernetes. It manages communication between microservices and simplifies the complex network environment. By deploying network proxies alongside application code, the service mesh controls the data plane. This combination of Kubernetes and service mesh is particularly beneficial for cloud-native applications with many services and instances. The service mesh ensures reliable and secure communication, allowing developers to focus on core application development. A Kubernetes service mesh, like any service mesh, simplifies how distributed applications communicate with each other. It acts as a layer of infrastructure that manages and controls this communication, abstracting away the complexity from individual services. Just like a tracking and routing service for packages, a Kubernetes service mesh tracks and directs traffic based on rules to ensure reliable and efficient communication between services. A service mesh consists of a data plane and a control plane. The data plane includes lightweight proxies deployed alongside application code, handling the actual service-to-service communication. The control plane configures these proxies, manages policies, and provides additional capabilities such as tracing and metrics collection. With a Kubernetes service mesh, developers can separate their application's logic from the infrastructure that handles security and observability, enabling secure and monitored communication between microservices. It also supports advanced deployment strategies and integrates with monitoring tools for better operational control. Istio as a Service Mesh Istio is a popular open-source service mesh that has gained significant adoption among major tech companies like Google, IBM, and Lyft. It leverages the data plane and control plane architecture common to all service meshes, with its data plane consisting of Envoy proxies deployed as sidecars within Kubernetes pods. The data plane in Istio is responsible for managing traffic, implementing fault injection for specific protocols, and providing application layer load balancing. This application layer load balancing differs from the transport layer load balancing in Kubernetes. Additionally, Istio includes components for collecting metrics, enforcing access control, authentication, and authorization, as well as integrating with monitoring and logging systems. It also supports encryption, authentication policies, and role-based access control through features like TLS authentication. Furthermore, Istio can be extended with various tools to enhance its functionality and integrate with other systems. This allows users to customize and expand the capabilities of their Istio service mesh based on their specific requirements.
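As a small illustration of the sidecar model described above, the usual way to opt workloads into the mesh on Kubernetes is to label their namespace for automatic injection; Istio then adds an Envoy proxy container to every pod created in that namespace. The namespace name below is hypothetical.

YAML
apiVersion: v1
kind: Namespace
metadata:
  name: demo                     # hypothetical namespace name
  labels:
    istio-injection: enabled     # Istio's injection webhook adds an Envoy sidecar to pods created here

No application code changes are required; the proxy is added when each pod is created.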
Traffic Management Istio offers traffic routing features that have a significant impact on performance and facilitate effective deployment strategies. These features allow precise control over the flow of traffic and API calls within a single cluster and across clusters. Within a single cluster, Istio's traffic routing rules enable efficient distribution of requests between services based on factors like load balancing algorithms, service versions, or user-defined rules. This ensures optimal performance by evenly distributing requests and dynamically adjusting routing based on service health and availability. Routing traffic across clusters enhances scalability and fault tolerance. Istio provides configuration options for traffic routing across clusters, including round-robin, least connections, or custom rules. This capability allows traffic to be directed to different clusters based on factors such as network proximity, resource utilization, or specific business requirements. In addition to performance optimization, Istio's traffic routing rules support advanced deployment strategies. A/B testing enables the routing of a certain percentage of traffic to a new service version while serving the majority of traffic to the existing version. Canary deployments involve gradually shifting traffic from an old version to a new version, allowing for monitoring and potential rollbacks. Staged rollouts incrementally increase traffic to a new version, enabling precise control and monitoring of the deployment process. Furthermore, Istio simplifies the configuration of service-level properties like circuit breakers, timeouts, and retries. Circuit breakers prevent cascading failures by redirecting traffic when a specified error threshold is reached. Timeouts and retries handle network delays or transient failures by defining response waiting times and the number of request retries. In summary, Istio's traffic routing capabilities provide a flexible and powerful means to control traffic and API calls, improving performance and facilitating advanced deployment strategies such as A/B testing, canary deployments, and staged rollouts. The following is a code sample that demonstrates how to use Istio's traffic routing features in Kubernetes using Istio VirtualService and DestinationRule resources: In the code below, we define a VirtualService named my-service with a host my-service.example.com. We configure traffic routing by specifying two routes: one to the v1 subset of the my-service destination and another to the v2 subset. We assign different weights to each route to control the proportion of traffic they receive. The DestinationRule resource defines subsets for the my-service destination, allowing us to route traffic to different versions of the service based on labels. In this example, we have subsets for versions v1 and v2. Code Sample YAML # Example VirtualService configuration apiVersion: networking.istio.io/v1alpha3 kind: VirtualService metadata: name: my-service spec: hosts: - my-service.example.com http: - route: - destination: host: my-service subset: v1 weight: 90 - destination: host: my-service subset: v2 weight: 10 # Example DestinationRule configuration apiVersion: networking.istio.io/v1alpha3 kind: DestinationRule metadata: name: my-service spec: host: my-service subsets: - name: v1 labels: version: v1 - name: v2 labels: version: v2 Observability As the complexity of services grows, it becomes increasingly challenging to comprehend their behavior and performance. 
Istio addresses this challenge by automatically generating detailed telemetry for all communications within a service mesh. This telemetry includes metrics, distributed traces, and access logs, providing comprehensive observability into the behavior of services. With Istio, operators can easily access and analyze metrics that capture various aspects of service performance, such as request rates, latency, and error rates. These metrics offer valuable insights into the health and efficiency of services, allowing operators to proactively identify and address performance issues. Distributed tracing in Istio enables the capturing and correlation of trace spans across multiple services involved in a request. This provides a holistic view of the entire request flow, allowing operators to understand the latency and dependencies between services. With this information, operators can pinpoint bottlenecks and optimize the performance of their applications. Full access logs provided by Istio capture detailed information about each request, including headers, payloads, and response codes. These logs offer a comprehensive audit trail of service interactions, enabling operators to investigate issues, debug problems, and ensure compliance with security and regulatory requirements. The telemetry generated by Istio is instrumental in empowering operators to troubleshoot, maintain, and optimize their applications. It provides a deep understanding of how services interact, allowing operators to make data-driven decisions and take proactive measures to improve performance and reliability. Furthermore, Istio's telemetry capabilities are seamlessly integrated into the service mesh without requiring any modifications to the application code, making it a powerful and convenient tool for observability. Istio automatically generates telemetry for all communications within a service mesh, including metrics, distributed traces, and access logs. Here's an example of how you can access metrics and logs using Istio: Commands in Bash # Access metrics: istioctl dashboard kiali # Access distributed traces: istioctl dashboard jaeger # Access access logs: kubectl logs -l istio=ingressgateway -n istio-system In the code above, we use the istioctl command-line tool to access Istio's observability dashboards. The istioctl dashboard kiali command opens the Kiali dashboard, which provides a visual representation of the service mesh and allows you to view metrics such as request rates, latency, and error rates. The istioctl dashboard jaeger command opens the Jaeger dashboard, which allows you to view distributed traces and analyze the latency and dependencies between services. To access access logs, we use the kubectl logs command to retrieve logs from the Istio Ingress Gateway. By filtering logs with the label istio=ingressgateway and specifying the namespace istio-system, we can view detailed information about each request, including headers, payloads, and response codes. By leveraging these observability features provided by Istio, operators can gain deep insights into the behavior and performance of their services. This allows them to troubleshoot issues, optimize performance, and ensure the reliability of their applications. Security Capabilities Microservices have specific security requirements, such as protecting against man-in-the-middle attacks, implementing flexible access controls, and enabling auditing tools. Istio addresses these needs with its comprehensive security solution. 
Istio's security model follows a "security-by-default" approach, providing in-depth defense for deploying secure applications across untrusted networks. It ensures strong identity management, authenticating and authorizing services within the service mesh to prevent unauthorized access and enhance security. Transparent TLS encryption is a crucial component of Istio's security framework. It encrypts all communication within the service mesh, safeguarding data from eavesdropping and tampering. Istio manages certificate rotation automatically, simplifying the maintenance of a secure communication channel between services. Istio also offers powerful policy enforcement capabilities, allowing operators to define fine-grained access controls and policies for service communication. These policies can be dynamically enforced and updated without modifying the application code, providing flexibility in managing access and ensuring secure communication. With Istio, operators have access to authentication, authorization, and audit (AAA) tools. Istio supports various authentication mechanisms, including mutual TLS, JSON Web Tokens (JWT), and OAuth2, ensuring secure authentication of clients and services. Additionally, comprehensive auditing capabilities help operators track service behavior, comply with regulations, and detect potential security incidents. In summary, Istio's security solution addresses the specific security requirements of microservices, providing strong identity management, transparent TLS encryption, policy enforcement, and AAA tools. It enables operators to deploy secure applications and protect services and data within the service mesh. Code Sample YAML # Example DestinationRule for mutual TLS authentication apiVersion: networking.istio.io/v1alpha3 kind: DestinationRule metadata: name: my-service spec: host: my-service trafficPolicy: tls: mode: MUTUAL clientCertificate: /etc/certs/client.pem privateKey: /etc/certs/private.key caCertificates: /etc/certs/ca.pem # Example AuthorizationPolicy for access control apiVersion: security.istio.io/v1beta1 kind: AuthorizationPolicy metadata: name: my-service-access spec: selector: matchLabels: app: my-service rules: - from: - source: principals: ["cluster.local/ns/default/sa/my-allowed-service-account"] to: - operation: methods: ["*"] In the code above, we configure mutual TLS authentication for the my-service destination using a DestinationRule resource. We set the mode to MUTUAL to enforce mutual TLS authentication between clients and the service. The clientCertificate, privateKey, and caCertificates fields specify the paths to the client certificate, private key, and CA certificate, respectively. We also define an AuthorizationPolicy resource to control access to the my-service based on the source service account. In this example, we allow requests from the my-allowed-service-account service account in the default namespace by specifying its principal in the principals field. By applying these configurations to an Istio-enabled Kubernetes cluster, you can enhance the security of your microservices by enforcing mutual TLS authentication and implementing fine-grained access controls. Circuit Breaking and Retry Circuit breaking and retries are crucial techniques in building resilient distributed systems, especially in microservices architectures. Circuit breaking prevents cascading failures by stopping requests to a service experiencing errors or high latency. 
Circuit breaking in Istio is configured through the DestinationRule resource: its trafficPolicy lets you cap connections and pending requests (connectionPool) and eject hosts that keep returning errors (outlierDetection), stopping further degradation when these thresholds are crossed. This isolation protects other services from being affected. Additionally, the retries setting of the VirtualService resource enables automatic retries of failed requests, with a configurable number of attempts, per-try timeouts, and triggering conditions. By retrying failed requests, transient failures can be handled effectively, increasing the chances of success. Combining circuit breaking and retries enhances the resilience of microservices, isolating failing services and providing resilient handling of intermittent issues. Retries and fault injection are configured within the VirtualService resource, while circuit-breaking thresholds are set in the DestinationRule, allowing for customization based on specific requirements. Overall, leveraging these features in Istio is essential for building robust and resilient microservices architectures, protecting against failures, and maintaining system reliability. In the code below, we configure retries and fault injection for my-service using the VirtualService resource, and circuit breaking using the DestinationRule resource. The retries section specifies that failed requests should be retried up to 3 times with a per-try timeout of 2 seconds. The retryOn field specifies the conditions under which retries should be triggered, such as 5xx server errors or connect failures. The fault section configures fault injection for the service. In this example, we introduce a fixed delay of 5 seconds for 50% of the requests and abort 10% of the requests with a 503 HTTP status code. (Note that Istio does not apply retries or timeouts on a route while fault injection is enabled on it, so the fault block is best reserved for test scenarios.) The trafficPolicy section of the DestinationRule defines the circuit-breaking thresholds for the service. The example configuration limits the service to 100 TCP connections, 100 concurrent HTTP/2 requests, and 10 pending HTTP/1.1 requests, and the outlier detection settings scan hosts every 10 seconds and eject a host for at least 5 seconds after it returns five consecutive 5xx responses. By applying this configuration to an Istio-enabled Kubernetes cluster, you can enable circuit breaking and retries for your microservices, enhancing resilience and preventing cascading failures. Code Sample YAML # Example VirtualService with retries and fault injection apiVersion: networking.istio.io/v1alpha3 kind: VirtualService metadata: name: my-service spec: hosts: - my-service http: - route: - destination: host: my-service subset: v1 retries: attempts: 3 perTryTimeout: 2s retryOn: 5xx,connect-failure fault: delay: fixedDelay: 5s percentage: value: 50 abort: httpStatus: 503 percentage: value: 10 # Example DestinationRule with circuit-breaking thresholds apiVersion: networking.istio.io/v1alpha3 kind: DestinationRule metadata: name: my-service spec: host: my-service trafficPolicy: connectionPool: tcp: maxConnections: 100 http: http1MaxPendingRequests: 10 http2MaxRequests: 100 outlierDetection: consecutive5xxErrors: 5 interval: 10s baseEjectionTime: 5s Canary Deployments Canary deployments with Istio offer a powerful strategy for releasing new features or updates to a subset of users or traffic while minimizing the risk of impacting the entire system. With Istio's traffic management capabilities, you can easily implement canary deployments by directing a fraction of the traffic to the new version or feature. Istio's VirtualService resource allows you to define routing rules based on percentages, HTTP headers, or other criteria to selectively route traffic. By gradually increasing the traffic to the canary version, you can monitor its performance and gather feedback before rolling it out to the entire user base. Istio also provides powerful observability features, such as distributed tracing and metrics collection, allowing you to closely monitor the canary deployment and make data-driven decisions.
In case of any issues or anomalies, you can quickly roll back to the stable version or implement other remediation strategies, minimizing the impact on users. Canary deployments with Istio provide a controlled and gradual approach to releasing new features, ensuring that changes are thoroughly tested and validated before impacting the entire system, thus improving the overall reliability and stability of your applications. To implement canary deployments with Istio, we can use the VirtualService resource to define routing rules and gradually shift traffic to the canary version. Code Sample YAML apiVersion: networking.istio.io/v1alpha3 kind: VirtualService metadata: name: my-service spec: hosts: - my-service http: - route: - destination: host: my-service subset: stable weight: 90 - destination: host: my-service subset: canary weight: 10 In the code above, we configure the VirtualService to route 90% of the traffic to the stable version of the service (subset: stable) and 10% of the traffic to the canary version (subset: canary). The weight field specifies the distribution of traffic between the subsets. By applying this configuration, you can gradually increase the traffic to the canary version and monitor its behavior and performance. Istio's observability features, such as distributed tracing and metrics collection, can provide insights into the canary deployment's behavior and impact. If any issues or anomalies are detected, you can quickly roll back to the stable version by adjusting the traffic weights or implementing other remediation strategies. By leveraging Istio's traffic management capabilities, you can safely release new features or updates, gather feedback, and mitigate risks before fully rolling them out to your user base. Autoscaling Istio seamlessly integrates with Kubernetes' Horizontal Pod Autoscaler (HPA) to enable automated scaling of microservices based on various metrics, such as CPU or memory usage. By configuring Istio's metrics collection and setting up the HPA, you can ensure that your microservices scale dynamically in response to increased traffic or resource demands. Istio's metrics collection capabilities allow you to gather detailed insights into the performance and resource utilization of your microservices. These metrics can then be used by the HPA to make informed scaling decisions. The HPA continuously monitors the metrics and adjusts the number of replicas for a given microservice based on predefined scaling rules and thresholds. When the defined thresholds are crossed, the HPA automatically scales up or down the number of pods, ensuring that the microservices can handle the current workload efficiently. This automated scaling approach eliminates the need for manual intervention and enables your microservices to adapt to fluctuating traffic patterns or resource demands in real time. By leveraging Istio's integration with Kubernetes' HPA, you can achieve optimal resource utilization, improve performance, and ensure the availability and scalability of your microservices. Code Sample YAML apiVersion: autoscaling/v2 kind: HorizontalPodAutoscaler metadata: name: my-service-hpa spec: scaleTargetRef: apiVersion: apps/v1 kind: Deployment name: my-service minReplicas: 1 maxReplicas: 10 metrics: - type: Resource resource: name: cpu target: type: Utilization averageUtilization: 50 In the example above, the HPA is configured to scale the my-service deployment based on CPU usage. The HPA will maintain an average CPU utilization of 50% across all pods.
By applying this configuration, Istio will collect metrics from your microservices, and the HPA will automatically adjust the number of replicas based on the defined scaling rules and thresholds. With this integration, your microservices can dynamically scale up or down based on traffic patterns and resource demands, ensuring optimal utilization of resources and improved performance. It’s important to note that the Istio integration with Kubernetes' HPA may require additional configuration and tuning based on your specific requirements and monitoring setup. Implementing Fault Injection and Chaos Testing With Istio Chaos fault injection with Istio is a powerful technique that allows you to test the resilience and robustness of your microservices architecture. Istio provides built-in features for injecting faults and failures into your system, simulating real-world scenarios, and evaluating how well your system can handle them. With Istio's Fault Injection feature, you can introduce delays, errors, aborts, or latency spikes to specific requests or services. By configuring VirtualServices and DestinationRules, you can selectively apply fault injection based on criteria such as HTTP headers or paths. By combining fault injection with observability features like distributed tracing and metrics collection, you can closely monitor the impact of injected faults on different services in real time. Chaos fault injection with Istio helps you identify weaknesses, validate error handling mechanisms, and build confidence in the resilience of your microservices architecture, ensuring the reliability and stability of your applications in production environments. Securing External Traffic Using Istio's Ingress Gateway Securing external traffic using Istio's Ingress Gateway is crucial for protecting your microservices architecture from unauthorized access and potential security threats. Istio's Ingress Gateway acts as the entry point for external traffic, providing a centralized and secure way to manage inbound connections. By configuring Istio's Ingress Gateway, you can enforce authentication, authorization, and encryption protocols to ensure that only authenticated and authorized traffic can access your microservices. Istio supports various authentication mechanisms such as JSON Web Tokens (JWT), mutual TLS (mTLS), and OAuth, allowing you to choose the most suitable method for your application's security requirements. Additionally, Istio's Ingress Gateway enables you to define fine-grained access control policies based on source IP, user identity, or other attributes, ensuring that only authorized clients can reach specific microservices. By leveraging Istio's powerful traffic management capabilities, you can also enforce secure communication between microservices within your architecture, preventing unauthorized access or eavesdropping. Overall, Istio's Ingress Gateway provides a robust and flexible solution for securing external traffic, protecting your microservices, and ensuring the integrity and confidentiality of your data and communications. Code Sample YAML apiVersion: networking.istio.io/v1alpha3 kind: Gateway metadata: name: my-gateway spec: selector: istio: ingressgateway servers: - port: number: 80 name: http protocol: HTTP hosts: - "*" In this example, we define a Gateway named my-gateway that listens on port 80 and accepts HTTP traffic from any host. The Gateway's selector is set to istio: ingressgateway, which ensures that it will be used as the Ingress Gateway for external traffic. 
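To show how the Gateway above is typically used, the following sketch binds a VirtualService to my-gateway so that inbound traffic is routed to the my-service workload. The external hostname and the service port are assumptions made for illustration, not values taken from this article.

YAML
apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
  name: my-service-external
spec:
  hosts:
  - "myapp.example.com"          # illustrative external hostname
  gateways:
  - my-gateway                   # binds this routing rule to the Gateway defined above
  http:
  - route:
    - destination:
        host: my-service
        port:
          number: 80             # assumed service port

With the Gateway accepting inbound connections and the VirtualService describing where they go, authentication and authorization policies can then be layered on top as discussed earlier.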
Best Practices for Managing and Operating Istio in Production Environments When managing and operating Istio in production environments, there are several best practices to follow. First, it is essential to carefully plan and test your Istio deployment before production rollout, ensuring compatibility with your specific application requirements and infrastructure. Properly monitor and observe your Istio deployment using Istio's built-in observability features, including distributed tracing, metrics, and logging. Regularly review and update Istio configurations to align with your evolving application needs and security requirements. Implement traffic management cautiously, starting with conservative traffic routing rules and gradually introducing more advanced features like traffic splitting and canary deployments. Take advantage of Istio's traffic control capabilities to implement circuit breaking, retries, and timeout policies to enhance the resilience of your microservices. Regularly update and patch your Istio installation to leverage the latest bug fixes, security patches, and feature enhancements. Lastly, establish a robust backup and disaster recovery strategy to mitigate potential risks and ensure business continuity. By adhering to these best practices, you can effectively manage and operate Istio in production environments, ensuring the reliability, security, and performance of your microservices architecture. Conclusion In the evolving landscape of service-to-service communication, Istio, as a service mesh, has surfaced as an integral component, offering a robust and flexible solution for managing complex communication between microservices in a distributed architecture. Istio's capabilities extend beyond merely facilitating communication to providing comprehensive traffic management, enabling sophisticated routing rules, retries, failovers, and fault injections. It also addresses security, a critical aspect in the microservices world, by implementing it at the infrastructure level, thereby reducing the burden on application code. Furthermore, Istio enhances observability in the system, allowing organizations to effectively monitor and troubleshoot their services. Despite the steep learning curve associated with Istio, the multitude of benefits it offers makes it a worthy investment for organizations. The control and flexibility it provides over microservices are unparalleled. With the growing adoption of microservices, the role of service meshes like Istio is becoming increasingly pivotal, ensuring reliable, secure operation of services, and providing the scalability required in today's dynamic business environment. In conclusion, Istio holds a significant position in the service mesh realm, offering a comprehensive solution for managing microservices at scale. It represents the ongoing evolution in service-to-service communication, driven by the need for more efficient, secure, and manageable solutions. The future of Istio and service mesh, in general, appears promising, with continuous research and development efforts aimed at strengthening and broadening their capabilities. References "What is a service mesh?" (Red Hat) "Istio - Connect, secure, control, and observe services." (Istio) "What is Istio?" (IBM Cloud) "Understanding the Basics of Service Mesh" (Container Journal)
I blogged about Java stream debugging in the past, but I skipped an important method that's worthy of a post of its own: peek. This blog post delves into the practicalities of using peek() to debug Java streams, complete with code samples and common pitfalls. Understanding Java Streams Java Streams represent a significant shift in how Java developers work with collections and data processing, introducing a functional approach to handling sequences of elements. Streams facilitate declarative processing of collections, enabling operations such as filter, map, reduce, and more in a fluent style. This not only makes the code more readable but also more concise compared to traditional iterative approaches. A Simple Stream Example To illustrate, consider the task of filtering a list of names to only include those that start with the letter "J" and then transforming each name into uppercase. Using the traditional approach, this might involve a loop and some "if" statements. However, with streams, this can be accomplished in a few lines: List<String> names = Arrays.asList("John", "Jacob", "Edward", "Emily"); // Convert list to stream List<String> filteredNames = names.stream() // Filter names that start with "J" .filter(name -> name.startsWith("J")) // Convert each name to uppercase .map(String::toUpperCase) // Collect results into a new list .collect(Collectors.toList()); System.out.println(filteredNames); Output: [JOHN, JACOB] This example demonstrates the power of Java streams: by chaining operations together, we can achieve complex data transformations and filtering with minimal, readable code. It showcases the declarative nature of streams, where we describe what we want to achieve rather than detailing the steps to get there. What Is the peek() Method? At its core, peek() is a method provided by the Stream interface, allowing developers a glance into the elements of a stream without disrupting the flow of its operations. The signature of peek() is as follows: Stream<T> peek(Consumer<? super T> action) It accepts a Consumer functional interface, which means it performs an action on each element of the stream without altering them. The most common use case for peek() is logging the elements of a stream to understand the state of data at various points in the stream pipeline. To understand peek, let's look at a sample similar to the previous one: List<String> collected = Stream.of("apple", "banana", "cherry") .filter(s -> s.startsWith("a")) .collect(Collectors.toList()); System.out.println(collected); This code filters a list of strings, keeping only the ones that start with "a". While it's straightforward, understanding what happens during the filter operation is not visible. Debugging With peek() Now, let's incorporate peek() to gain visibility into the stream: List<String> collected = Stream.of("apple", "banana", "cherry") .peek(System.out::println) // Logs all elements .filter(s -> s.startsWith("a")) .peek(System.out::println) // Logs filtered elements .collect(Collectors.toList()); System.out.println(collected); By adding peek() both before and after the filter operation, we can see which elements are processed and how the filter impacts the stream. This visibility is invaluable for debugging, especially when the logic within the stream operations becomes complex. We can't step over stream operations with the debugger, but peek() provides a glance into the code that is normally obscured from us. 
Uncovering Common Bugs With peek() Filtering Issues Consider a scenario where a filter condition is not working as expected: List<String> collected = Stream.of("apple", "banana", "cherry", "Avocado") .filter(s -> s.startsWith("a")) .collect(Collectors.toList()); System.out.println(collected); Expected output might be ["apple"], but let's say we also wanted "Avocado" due to a misunderstanding of the startsWith method's behavior. Since "Avocado" is spelled with an uppercase "A", this expression returns false: "Avocado".startsWith("a"). Using peek(), we can observe the elements that pass the filter: List<String> debugged = Stream.of("apple", "banana", "cherry", "Avocado") .peek(System.out::println) .filter(s -> s.startsWith("a")) .peek(System.out::println) .collect(Collectors.toList()); System.out.println(debugged); Large Data Sets In scenarios involving large datasets, directly printing every element in the stream to the console for debugging can quickly become impractical. It can clutter the console and make it hard to spot the relevant information. Instead, we can use peek() in a more sophisticated way to selectively collect and analyze data without causing side effects that could alter the behavior of the stream. Consider a scenario where we're processing a large dataset of transactions, and we want to debug issues related to transactions exceeding a certain threshold: class Transaction { private String id; private double amount; // Constructor, getters, and setters omitted for brevity } List<Transaction> transactions = // Imagine a large list of transactions // A placeholder for debugging information List<Transaction> highValueTransactions = new ArrayList<>(); List<Transaction> processedTransactions = transactions.stream() // Filter transactions above a threshold .filter(t -> t.getAmount() > 5000) .peek(t -> { if (t.getAmount() > 10000) { // Collect only high-value transactions for debugging highValueTransactions.add(t); } }) .collect(Collectors.toList()); // Now, we can analyze high-value transactions separately, without overloading the console System.out.println("High-value transactions count: " + highValueTransactions.size()); In this approach, peek() is used to inspect elements within the stream conditionally. High-value transactions that meet a specific criterion (e.g., amount > 10,000) are collected into a separate list for further analysis. This technique allows for targeted debugging without printing every element to the console, thereby avoiding performance degradation and clutter. Addressing Side Effects Streams shouldn't have side effects. In fact, such side effects would break the stream debugger in IntelliJ, which I have discussed in the past. It's crucial to note that while collecting data for debugging within peek() avoids cluttering the console, it does introduce a side effect to the stream operation, which goes against the recommended use of streams. Streams are designed to be side-effect-free to ensure predictability and reliability, especially in parallel operations. Therefore, while the above example demonstrates a practical use of peek() for debugging, it's important to use such techniques judiciously. Ideally, this debugging strategy should be temporary and removed once the debugging session is completed to maintain the integrity of the stream's functional paradigm. Limitations and Pitfalls While peek() is undeniably a useful tool for debugging Java streams, it comes with its own set of limitations and pitfalls that developers should be aware of.
Understanding these can help avoid common traps and ensure that peek() is used effectively and appropriately. Potential for Misuse in Production Code One of the primary risks associated with peek() is its potential for misuse in production code. Because peek() is intended for debugging purposes, using it to alter state or perform operations that affect the outcome of the stream can lead to unpredictable behavior. This is especially true in parallel stream operations, where the order of element processing is not guaranteed. Misusing peek() in such contexts can introduce hard-to-find bugs and undermine the declarative nature of stream processing. Performance Overhead Another consideration is the performance impact of using peek(). While it might seem innocuous, peek() can introduce a significant overhead, particularly in large or complex streams. This is because every action within peek() is executed for each element in the stream, potentially slowing down the entire pipeline. When used excessively or with complex operations, peek() can degrade performance, making it crucial to use this method judiciously and remove any peek() calls from production code after debugging is complete. Side Effects and Functional Purity As highlighted in the enhanced debugging example, peek() can be used to collect data for debugging purposes, but this introduces side effects to what should ideally be a side-effect-free operation. The functional programming paradigm, which streams are a part of, emphasizes purity and immutability. Operations should not alter state outside their scope. By using peek() to modify external state (even for debugging), you're temporarily stepping away from these principles. While this can be acceptable for short-term debugging, it's important to ensure that such uses of peek() do not find their way into production code, as they can compromise the predictability and reliability of your application. The Right Tool for the Job Finally, it's essential to recognize that peek() is not always the right tool for every debugging scenario. In some cases, other techniques such as logging within the operations themselves, using breakpoints and inspecting variables in an IDE, or writing unit tests to assert the behavior of stream operations might be more appropriate and effective. Developers should consider peek() as one tool in a broader debugging toolkit, employing it when it makes sense and opting for other strategies when they offer a clearer or more efficient path to identifying and resolving issues. Navigating the Pitfalls To navigate these pitfalls effectively: Reserve peek() strictly for temporary debugging purposes. If you have a linter as part of your CI tools, it might make sense to add a rule that blocks code from invoking peek(). Always remove peek() calls from your code before committing it to your codebase, especially for production deployments. Be mindful of performance implications and the potential introduction of side effects. Consider alternative debugging techniques that might be more suited to your specific needs or the particular issue you're investigating. By understanding and respecting these limitations and pitfalls, developers can leverage peek() to enhance their debugging practices without falling into common traps or inadvertently introducing problems into their codebases. Final Thoughts The peek() method offers a simple yet effective way to gain insights into Java stream operations, making it a valuable tool for debugging complex stream pipelines. 
By understanding how to use peek() effectively, developers can avoid common pitfalls and ensure their stream operations perform as intended. As with any powerful tool, the key is to use it wisely and in moderation. The true value of peek() lies in debugging massive data sets, which are very hard to analyze even with dedicated tools. By using peek(), we can dig into such a data set and understand the source of the issue programmatically.