Generative AI has become a powerful tool for creating new and innovative content, from captivating poems to photorealistic images. But where do you begin when you start learning in this exciting area? Python, with its robust libraries and active community, is a natural starting point. This article covers some of the most popular Python tools for generative AI, with code examples to kickstart your creative journey.

1. Text Generation With Transformers

The Transformers library, built on top of PyTorch, offers a convenient way to interact with pre-trained language models like GPT-2. These models, trained on massive datasets of text and code, can generate realistic and coherent text continuations. Here's an example of using the transformers library to generate creative text:

```python
from transformers import GPT2Tokenizer, GPT2LMHeadModel

# Load the pre-trained model and tokenizer
tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")

# Define the starting prompt
prompt = "Once upon a time, in a land far, far away..."

# Encode the prompt and generate text
encoded_prompt = tokenizer.encode(prompt, return_tensors="pt")
output = model.generate(encoded_prompt, max_length=100, num_beams=5)

# Decode the generated text (the output already includes the prompt)
generated_text = tokenizer.decode(output[0], skip_special_tokens=True)

# Print the generated text
print(generated_text)
```

The code first loads the pre-trained GPT-2 model and tokenizer from the Hugging Face model hub. The prompt, acting as a seed, is then encoded into a format the model understands. The generate function takes this encoded prompt and generates a sequence of up to 100 tokens, using beam search with 5 beams to explore different potential continuations. Finally, the generated text, which already includes the original prompt, is decoded back into a human-readable format and printed.

2. Image Generation With Diffusers

Diffusers, another library built on PyTorch, simplifies experimentation with image diffusion models. These models start from random noise and iteratively refine the image to match a user-provided text description. Here's an example using Diffusers to generate an image based on a text prompt:

```python
from diffusers import StableDiffusionPipeline

# Define the text prompt
prompt = "A majestic eagle soaring through a clear blue sky"

# Load the Stable Diffusion pipeline
pipe = StableDiffusionPipeline.from_pretrained("runwayml/stable-diffusion-v1-5")

# Generate the image
result = pipe(prompt=prompt, num_inference_steps=50)

# Save the generated image
result.images[0].save("eagle.png")
```

The code defines a text prompt describing the desired image. The Stable Diffusion pipeline is then loaded, and the prompt is passed to it. The num_inference_steps parameter controls the number of iterations the model takes to refine the image, with more steps generally leading to higher fidelity. Finally, the generated image is saved as a PNG file.

2.1 Image Generation: Painting With Pixels Using StyleGAN2

Stepping into the domain of image generation, StyleGAN2, an NVIDIA project, empowers you to create photorealistic images with remarkable control over style.
Here's a glimpse into using StyleGAN2 (the snippet is illustrative; the exact API depends on the StyleGAN2 port you install):

```python
# Install a StyleGAN2 library first (see the official repository for instructions)
import stylegan2_pytorch as sg2

# Load a pre-trained model (e.g., FFHQ, trained on human faces)
generator = sg2.Generator(ckpt="ffhq.pkl")

# Sample a random latent vector as the starting point
latent_vector = sg2.sample_latent(1)

# Generate the image
generated_image = generator(latent_vector)

# Display or save the generated image using libraries like OpenCV or PIL
```

After installation (refer to the official repository for detailed instructions), you load a pre-trained model like "ffhq", which represents human faces. The sample_latent function generates a random starting point, and the generator model transforms it into an image.

3. Code Completion With Gradio

Gradio isn't solely for generative AI, but it is a powerful tool for interacting with and showcasing these models. Here's an example of using Gradio to create a simple code completion interface, backed by a code generation model from the Hugging Face Hub (Salesforce's CodeGen, standing in for the OpenAI Codex model the original referenced, which is not available on the Hub):

```python
import gradio as gr
from transformers import AutoTokenizer, AutoModelForCausalLM

# Load a pre-trained code generation model
tokenizer = AutoTokenizer.from_pretrained("Salesforce/codegen-350M-mono")
model = AutoModelForCausalLM.from_pretrained("Salesforce/codegen-350M-mono")

def complete_code(code):
    """Completes the provided code snippet."""
    encoded_input = tokenizer(code, return_tensors="pt")
    output = model.generate(**encoded_input, max_new_tokens=64)
    return tokenizer.decode(output[0], skip_special_tokens=True)

# Create the Gradio interface
interface = gr.Interface(complete_code, inputs="text", outputs="text", title="Code Completion")

# Launch the interface
interface.launch()
```

The complete_code function takes a code snippet as input, encodes it, and uses the model to generate the most likely continuation. The generated tokens are decoded and returned. Gradio then provides a simple interface where users can enter code and see the suggested completion.

To summarize, the Python ecosystem offers a rich set of tools for exploring and utilizing the power of generative AI. From established libraries like TensorFlow and PyTorch to specialized offerings like Diffusers and StyleGAN, developers have a diverse toolkit at their disposal for tackling various generative tasks. As the field continues to evolve, we can expect even more powerful and user-friendly tools to emerge, further democratizing the access and application of generative AI for diverse purposes.
As part of learning the Rust ecosystem, I dedicated the last few days to error management. Here are my findings.

Error Management 101

The Rust book describes the basics of error management. The language distinguishes between recoverable and unrecoverable errors. Unrecoverable errors rely on the panic!() macro: when Rust panics, it stops the program. Recoverable errors are much more enjoyable. Rust uses the Either monad, which stems from Functional Programming. As opposed to exceptions in other languages, FP mandates returning a structure that may contain either the requested value or the error. The language models it as an enum with a generic type parameter for each variant:

```rust
#[derive(Copy, PartialEq, PartialOrd, Eq, Ord, Debug, Hash)]
pub enum Result<T, E> {
    Ok(T),
    Err(E),
}
```

Because Rust checks the exhaustiveness of matches, matching on a Result enforces that you handle both branches:

```rust
match fn_that_returns_a_result() {
    Ok(value) => do_something_with_value(value),
    Err(error) => handle_error(error),
}
```

If you omit one of the two branches, compilation fails. The above code is safe, if unwieldy. But Rust offers a full-fledged API around the Result struct. The API implements the monad paradigm.

Propagating Results

Propagating results and errors is one of the main micro-tasks in programming. Here's a naive way to approach it:

```rust
#[derive(Debug)]
struct Foo {}

#[derive(Debug)]
struct Bar { foo: Foo }

#[derive(Debug)]
struct MyErr {}

fn main() {
    print!("{:?}", a(false));
}

fn a(error: bool) -> Result<Bar, MyErr> {
    match b(error) {                       //1
        Ok(foo) => Ok(Bar { foo }),        //2
        Err(my_err) => Err(my_err)         //3
    }
}

fn b(error: bool) -> Result<Foo, MyErr> {
    if error { Err(MyErr {}) } else { Ok(Foo {}) }
}
```

1. Return a Result that contains either a Bar or a MyErr
2. If the call is successful, unwrap the Foo value, wrap it again, and return it
3. If it isn't, unwrap the error, wrap it again, and return it

The above code is a bit verbose, and because this construct is quite widespread, Rust offers the ? operator:

"When applied to values of the Result type, it propagates errors. If the value is Err(e), then it will return Err(From::from(e)) from the enclosing function or closure. If applied to Ok(x), then it will unwrap the value to evaluate to x." (The question mark operator)

We can apply it to the above a function:

```rust
fn a(error: bool) -> Result<Bar, MyErr> {
    let foo = b(error)?;
    Ok(Bar { foo })
}
```

The Error Trait

Note that Result enforces no bound on the right-hand type, the "error" type. However, Rust provides an Error trait. Two widespread libraries help us manage our errors more easily. Let's detail them in turn.

Implement the Error Trait With thiserror

A struct can implement the Error trait by hand, but doing so requires quite a load of boilerplate code. The thiserror crate provides macros to write the code for us. Here's the documentation sample:

```rust
#[derive(Error, Debug)]                                                    //1
pub enum DataStoreError {
    #[error("data store disconnected")]                                    //2
    Disconnect(#[from] io::Error),
    #[error("the data for key `{0}` is not available")]                    //3
    Redaction(String),
    #[error("invalid header (expected {expected:?}, found {found:?})")]    //4
    InvalidHeader {
        expected: String,
        found: String,
    },
}
```

1. Base Error macro
2. Static error message
3. Dynamic error message, using the field index
4. Dynamic error message, using field names

thiserror helps you generate your errors.

Propagate Results With anyhow

The anyhow crate offers several features:
- A custom anyhow::Result struct (I will focus on this one)
- A way to attach context to a function returning an anyhow::Result
- Additional backtrace environment variables
- Compatibility with thiserror
- A macro to create errors on the fly

Result propagation has one major issue: function signatures across unrelated error types. The above snippet used a single enum, but in real-world projects, errors may come from different crates. Here's an illustration:

```rust
#[derive(thiserror::Error, Debug)]
#[error("error X")]
pub struct ErrorX {}                                             //1

#[derive(thiserror::Error, Debug)]
#[error("error Y")]
pub struct ErrorY {}                                             //1

fn a(flag: i8) -> Result<Foo, Box<dyn std::error::Error>> {      //2
    match flag {
        1 => Err(ErrorX {}.into()),                              //3
        2 => Err(ErrorY {}.into()),                              //3
        _ => Ok(Foo {})
    }
}
```

1. Two error types, each implemented with a different struct with thiserror; the #[error] attribute supplies the Display implementation that thiserror requires
2. Rust needs to know the size of the return type at compile time. Because the function can return either one type or the other, we must return a fixed-size pointer; that's the point of the Box construct. For a discussion on when to use Box compared to other constructs, please read this StackOverflow question.
3. To wrap the struct into a Box, we rely on the into() method

With anyhow, we can simplify the above code:

```rust
fn a(flag: i8) -> anyhow::Result<Foo> {
    match flag {
        1 => Err(ErrorX {}.into()),
        2 => Err(ErrorY {}.into()),
        _ => Ok(Foo {})
    }
}
```

With the Context trait, we can improve the user experience with additional details. The with_context() method is evaluated lazily, while context() is evaluated eagerly. Here's how you can use the latter:

```rust
fn a(flag: i8) -> anyhow::Result<Bar> {
    let foo = b(flag).context(format!("Oopsie! {}", flag))?;     //1
    Ok(Bar { foo })
}

fn b(flag: i8) -> anyhow::Result<Foo> {
    match flag {
        1 => Err(ErrorX {}.into()),
        2 => Err(ErrorY {}.into()),
        _ => Ok(Foo {})
    }
}
```

1. If the function fails, attach the additional Oopsie! error message, including the flag value, to the error

Conclusion

Rust implements error handling via FP's Either monad, in the form of the Result enum. Managing such code in bare Rust requires boilerplate code. The thiserror crate can easily implement the Error trait for your structs, while anyhow simplifies function and method signatures.

To Go Further

- Rust error handling
- The Error trait
- anyhow crate
- thiserror crate
- What is the difference between context and with_context in anyhow?
- Error handling across different languages
Python is a popular, versatile, and easy-to-learn programming language that offers excellent career options, such as Python developer, full-stack developer, software engineer, cybersecurity expert, and data analyst. It is an excellent language with which to begin your career in IT. Python helps individuals develop the frontend and the backend and work with databases. Good Python coding skills allow professionals to execute all Python-related work efficiently and conveniently. If you want to excel with Python, you must give special attention to your coding skills. In this blog, we'll share the 10 best tips to help you improve your Python coding skills.

Tips To Improve Python Coding Skills

1. Understand the Basics

To learn any language, it's essential to get a command of the basics before tackling complex concepts. The same goes for Python. You must understand the basics of Python, including syntax, data types, and data structures. To build a solid grasp of the fundamentals, you can use online tutorials and interactive platforms, join Python fundamentals classes, or work through beginner-friendly Python books.

2. Practice Regularly

Practice is crucial to achieving success. To enhance your Python coding skills, maintain consistency. Set aside dedicated time for writing code, solving problems, and working on small projects. Platforms like HackerRank, LeetCode, and Codewars offer a variety of coding challenges suitable for different skill levels. You can use these platforms to enhance your Python coding skills.

3. Read Code

Python is an easy-to-learn and easy-to-read language, close to plain English. You can leverage this and read code written by experienced developers. Analyze open-source projects on GitHub or explore Python libraries to understand how others structure their code. This exposure will give you insights into best practices and different coding styles, and it will speed up your learning.

4. Be With Like-Minded People

What can be better than surrounding yourself with people who equally enjoy and practice Python? Doing so creates healthy competition to grow and learn. Join the Python community, where you can ask questions, seek guidance, and share your knowledge with other members. Platforms like Stack Overflow, Reddit, and Python forums are great places to connect with fellow learners and experienced developers.

5. Learn From Mistakes

Learning from mistakes applies everywhere, including while learning Python. Learn from the mistakes you make or observe while solving any problem. Debugging is a critical skill in software development; by improving it, you can become an excellent Python coder. In addition, you can read case studies, use other written or audio-visual materials, and consider others' mistakes when learning Python.

6. Build Projects

Just consuming learning material won't help you for long. You need to apply what you have learned in the real world, and for this, you need to work on projects. Develop your own projects, whether it's a simple web scraper, a data analysis tool, a game, or something larger. Building projects will give you confidence and hands-on experience.
7. Contribute to Open Source

Contributing to open-source projects will improve your coding skills and help you build a portfolio. Various open-source projects are available; you can collaborate and contribute to them to enrich your learning and your Python coding skills. Many companies also release open-source projects and accept contributions from Python developers. Contributing to such projects can help you grow exponentially.

8. Join an Internship or Entry-Level Job

You can also join internships to apply your learning to real projects. Internships offer guidance from seniors, the chance to work on projects with professionals, and more. In addition, if you feel confident in your skills, you can join an entry-level Python job and keep improving your coding skills on the job.

9. Collaborate With Professionals

You can use platforms like LinkedIn or Reddit to connect with professionals, collaborate, and work on projects online. This lets you apply your skills and learning to projects under expert guidance. Collaborating with professionals will also help you refine or correct your existing understanding of Python and keep you up to date.

10. Keep Learning

The best way to master any skill is to keep learning and improving with up-to-date information, knowledge, and experience. To master coding, keep learning. Use online resources to maintain learning consistency, attend workshops and seminars, and teach or explain concepts to others to reinforce your own Python coding skills.

Conclusion

You can improve your Python coding skills with practice, real project experience, consistency, and dedication; however, if you need help developing this skill, there are various classes and courses you can join. Python is among the top programming languages, and investing your time in learning it can benefit you a lot, especially if you're looking to build your career around Python.
Creating an OAuth 2.0 Authorization Server from scratch involves understanding the OAuth 2.0 framework and implementing its various components, such as the authorization endpoint, the token endpoint, and client registration. In this detailed guide, we'll walk through building a simple OAuth 2.0 Authorization Server using Python 3 and Flask, a popular web framework. This server will handle basic OAuth flows, including client registration, the authorization code flow, and issuing access tokens.

Setting Up Your Environment

First, ensure you have Python 3 installed on your system. You'll also need pip for installing Python packages.

1. Create a Virtual Environment

```shell
python3 -m venv oauth-server-env
source oauth-server-env/bin/activate  # On Windows, use `oauth-server-env\Scripts\activate`
```

2. Install Flask

```shell
pip install Flask
```

3. Install Other Required Packages

```shell
pip install Flask-HTTPAuth PyJWT
```

Flask-HTTPAuth will help with client authentication, and PyJWT is used for creating JSON Web Tokens (JWT), which will serve as our access tokens.

Project Structure

Create a new directory for your project (oauth_server) and create the following files:

- app.py: The main application file
- client_registry.py: A simple registry for storing client details
- auth_server.py: Contains the logic for the OAuth 2.0 Authorization Server

Implementing the Client Registry

Let's start by implementing a basic client registry. This registry will store client details and provide methods to register new clients and validate client credentials.

client_registry.py

```python
import uuid

clients = {}

def register_client(client_name):
    client_id = str(uuid.uuid4())
    client_secret = str(uuid.uuid4())
    clients[client_id] = {'client_secret': client_secret, 'client_name': client_name}
    return client_id, client_secret

def validate_client(client_id, client_secret):
    return client_id in clients and clients[client_id]['client_secret'] == client_secret
```

This registry uses Python's built-in uuid library to generate unique identifiers for client IDs and secrets.
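As a quick sanity check, the registry can be exercised on its own. This is a minimal, purely illustrative sketch (not one of the project files):

```python
from client_registry import register_client, validate_client

# Register a client and capture its generated credentials
client_id, client_secret = register_client("demo-app")

# Credentials round-trip: the valid pair passes, a wrong secret fails
assert validate_client(client_id, client_secret) is True
assert validate_client(client_id, "wrong-secret") is False

print("Client registry works as expected")
```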
Building the Authorization Server

Next, we'll implement the OAuth 2.0 endpoints in our authorization server.

auth_server.py

```python
import uuid
import datetime

import jwt
from flask import Flask, request, jsonify, redirect, url_for

from client_registry import register_client, validate_client

app = Flask(__name__)

@app.route('/register', methods=['POST'])
def register():
    client_name = request.json.get('client_name')
    client_id, client_secret = register_client(client_name)
    return jsonify({'client_id': client_id, 'client_secret': client_secret})

@app.route('/authorize')
def authorize():
    # In a real application, you would validate the user here
    client_id = request.args.get('client_id')
    redirect_uri = request.args.get('redirect_uri')
    # Generate an authorization code
    auth_code = str(uuid.uuid4())
    # Redirect back to the client with the auth code
    return redirect(f"{redirect_uri}?code={auth_code}")

@app.route('/token', methods=['POST'])
def token():
    auth_code = request.form.get('code')
    client_id = request.form.get('client_id')
    client_secret = request.form.get('client_secret')

    if not validate_client(client_id, client_secret):
        return jsonify({'error': 'invalid_client'}), 401

    # In a real application, you'd validate the auth code here

    # Generate a JWT as the access token
    payload = {
        'client_id': client_id,
        'exp': datetime.datetime.utcnow() + datetime.timedelta(minutes=30)  # 30 min expiry
    }
    access_token = jwt.encode(payload, 'secret', algorithm='HS256')

    return jsonify({'access_token': access_token})

if __name__ == '__main__':
    app.run(debug=True)
```

This server provides three endpoints:

- /register: Allows clients to register and receive their client_id and client_secret
- /authorize: Simulates the authorization step where a user approves the client; in a real application, this would involve user authentication and consent
- /token: Exchanges an authorization code for an access token; here, we use a JWT as our access token format

Running Your Authorization Server

Execute your application:

```shell
python auth_server.py
```

Your OAuth 2.0 Authorization Server is now running and ready to handle requests.

Testing the Flow

Testing the OAuth 2.0 authorization flow within a web application involves a sequence of steps to ensure that the integration not only adheres to the OAuth 2.0 specifications but also secures user data effectively. The goal of testing is to simulate the entire OAuth process, from initiation to the acquisition of the access token (and optionally the refresh token), mimicking the actions a real user would take.

Setting Up the Test Environment

Before testing the OAuth 2.0 authorization flow, ensure your test environment is properly set up. This includes:

- Authorization server: Your OAuth 2.0 Authorization Server should be fully configured and running. For Python 3 applications, this could be a Flask or Django app that you've set up as per the OAuth 2.0 specifications.
- Client application: A client web application configured to request authorization from the server; this is typically another web application that will use OAuth for authentication.
- User accounts: Test user accounts on the authorization server with predefined permissions or roles.
- Secure connection: Ensure all communications are over HTTPS, even in your testing environment, to simulate real-world conditions accurately.

Step-by-Step Testing Process

1. Initiate the Authorization Request

From the client application, initiate an authorization request. This usually involves directing the user's browser to the authorization URL constructed with the necessary query parameters like response_type, client_id, redirect_uri, scope, and an optional state parameter for CSRF protection. (A sketch of building such a URL follows below.)
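For illustration, here is a minimal sketch of how a client might assemble that URL; the endpoint, client_id, and redirect_uri are the example values used later in this guide, so adjust them (and the path) to your own server's routes:

```python
from urllib.parse import urlencode

# Example authorization endpoint from the test setup below
AUTHORIZE_ENDPOINT = "http://127.0.0.1:5000/oauth/authorize"

params = {
    "response_type": "code",
    "client_id": "abc123",
    "redirect_uri": "https://client.example.com/callback",
    "scope": "read",
    "state": "xyz",  # random value kept in the user's session for CSRF protection
}

# The client redirects the user's browser to this URL
authorization_url = f"{AUTHORIZE_ENDPOINT}?{urlencode(params)}"
print(authorization_url)
```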
2. Authenticate and Authorize

The user should be redirected to the authorization server's login page, where they can enter their credentials. After successful authentication, the user should be shown a consent screen (if applicable) where they can authorize the requested permissions. Verify that the state parameter, if used, is correctly validated by the client upon redirection.

3. Authorization Response

Upon user consent, the authorization server should redirect the user back to the client application using the redirect_uri provided, including an authorization code in the query parameters. Ensure that the redirection to the redirect_uri occurs correctly and that the authorization code is present in the request.

4. Token Exchange

The client application should then exchange the authorization code for an access token (and optionally a refresh token) by making a request to the token endpoint of the authorization server. Verify that the token exchange requires authentication (client ID and secret) and is done over HTTPS. Check that the access token (and refresh token, if applicable) is returned in the response.

5. Access Token Use

The client application should use the access token to make authenticated requests to the resource server on behalf of the user. Ensure that resources can be accessed using the access token and that requests without a valid access token are denied.

6. Refresh Token Process (If Applicable)

If a refresh token was issued, simulate the expiration of the access token and use the refresh token to obtain a new access token. Validate that the new access token grants access to the resources as expected.

Testing for Security and Compliance

- Invalid request handling: Test how the authorization server handles invalid requests, such as missing parameters or incorrect client credentials. The server should respond with an appropriate error response.
- Token revocation and expiry: Ensure that expired or revoked tokens are correctly handled and do not grant access to protected resources.
- Scope validation: Verify that the scope of the access tokens is respected and that tokens only grant access to the resources they are supposed to.

Automated Testing Tools

Consider using automated testing tools and frameworks designed for OAuth 2.0 to streamline the testing process. Tools like Postman, the OAuth 2.0 Test Server provided by OAuth.tools, and custom scripts can automate many of the steps involved in testing the OAuth flow.

Preparing the Testing Environment

- Authorization server setup: Assume we have a Flask-based OAuth 2.0 Authorization Server running at http://127.0.0.1:5000.
- Client application setup: A Django web application acts as the OAuth 2.0 client, registered with the Authorization Server with a client_id of abc123 and a client_secret.
- Secure connection: Both applications enforce HTTPS.

Test 1: Initiate the Authorization Request

The client application redirects the user to the authorization server with a URL structured like so:

```
http://127.0.0.1:5000/oauth/authorize?response_type=code&client_id=abc123&redirect_uri=https://client.example.com/callback&scope=read&state=xyz
```

Expected Outcome

The user is redirected to the login page of the authorization server.

Detailed Steps and Checks

- Redirection to authorization server: Ensure the user's browser is redirected to the correct URL on the authorization server.
- Correct query parameters: Verify that the URL contains all necessary parameters (response_type, client_id, redirect_uri, scope, and state). A small automated check for this is sketched below.
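A minimal sketch of such a check, using only the standard library; the URL is the example above, and in a real test you would capture the URL your client actually generates:

```python
from urllib.parse import urlparse, parse_qs

authorization_url = (
    "http://127.0.0.1:5000/oauth/authorize?response_type=code&client_id=abc123"
    "&redirect_uri=https://client.example.com/callback&scope=read&state=xyz"
)

query = parse_qs(urlparse(authorization_url).query)

# Every required authorization request parameter must be present
for param in ("response_type", "client_id", "redirect_uri", "scope", "state"):
    assert param in query, f"missing parameter: {param}"

assert query["response_type"] == ["code"]
print("Authorization URL contains all required parameters")
```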
Test 2: User Authentication and Authorization

Upon redirection, the user logs in with their credentials and grants the requested permissions.

Expected Outcome

The authorization server redirects the user back to the client application's callback URL with an authorization code.

Detailed Steps and Checks

- Login and consent: After logging in, check whether the consent screen correctly displays the requested permissions (based on scope).
- Redirection with authorization code: Ensure the server redirects to https://client.example.com/callback?code=AUTH_CODE&state=xyz.
- State parameter validation: Confirm the client application validates that the returned state parameter matches the one sent in the initial request.

Test 3: Exchange Authorization Code for Access Token

The client application exchanges the authorization code for an access token.

Example Request

Using Python's requests library, the client application makes a POST request to the token endpoint:

```python
import requests

data = {
    'grant_type': 'authorization_code',
    'code': 'AUTH_CODE',
    'redirect_uri': 'https://client.example.com/callback',
    'client_id': 'abc123',
    'client_secret': 'secret'
}

response = requests.post('http://127.0.0.1:5000/oauth/token', data=data)
```

Expected Outcome

The authorization server responds with an access token.

Detailed Steps and Checks

- Correct token endpoint: Verify the POST request is made to the correct token endpoint URL.
- Successful token response: Ensure the response includes an access_token (and optionally a refresh_token).

Test 4: Using the Access Token

The client application uses the access token to request resources from the resource server on behalf of the user.

Example Resource Request

```python
headers = {'Authorization': f'Bearer {access_token}'}
response = requests.get('http://127.0.0.1:5000/userinfo', headers=headers)
```

Expected Outcome

The resource server returns the requested user information.

Detailed Steps and Checks

- Access token in Authorization header: Confirm the access token is included in the request header.
- Valid resource access: Check that the response from the resource server contains the expected data.

Test 5: Refreshing the Access Token

Assuming a refresh_token was obtained, simulate the access token expiration and refresh it.

Example Refresh Request

```python
data = {
    'grant_type': 'refresh_token',
    'refresh_token': 'REFRESH_TOKEN',
    'client_id': 'abc123',
    'client_secret': 'secret'
}

response = requests.post('http://127.0.0.1:5000/oauth/token', data=data)
```

Expected Outcome

A new access token is issued by the authorization server.

Detailed Steps and Checks

- Successful token refresh: Ensure the response contains a new access_token.
- Valid access with new token: Verify the new access token can access resources.

Conclusion

Testing the OAuth 2.0 authorization flow is crucial for ensuring the security and functionality of your web application's authentication mechanism. A thorough testing process not only validates the implementation against the OAuth 2.0 specification but also identifies potential security vulnerabilities that could compromise user data. By following a structured testing approach, you can assure users of a secure and seamless authentication experience.
Here's how to use AI and API Logic Server to create complete running systems in minutes:

- Use ChatGPT for schema automation: create a database schema from natural language.
- Use open-source API Logic Server: create working software with one command.
  - App automation: a multi-page, multi-table admin app.
  - API automation: a JSON:API with CRUD for each table, including filtering, sorting, optimistic locking, and pagination.
- Customize the project with your IDE:
  - Logic automation using rules: declare spreadsheet-like rules in Python for multi-table derivations and constraints, 40X more concise than code.
  - Use Python and standard libraries (Flask, SQLAlchemy) and debug in your IDE.
- Iterate your project:
  - Revise your database design and logic.
  - Integrate with B2B partners and internal systems.

This process leverages your existing IT infrastructure: your IDE, GitHub, the cloud, your database, open source. Let's see how.

1. AI: Schema Automation

You can use an existing database or create a new one with ChatGPT or your database tools. Use ChatGPT to generate SQL commands for database creation:

```
Create a sqlite database for customers, orders, items and product

Hints: use autonum keys, allow nulls, Decimal types, foreign keys, no check constraints.
Include a notes field for orders.
Create a few rows of only customer and product data.

Enforce the Check Credit requirement:
Customer.Balance <= CreditLimit
Customer.Balance = Sum(Order.AmountTotal where date shipped is null)
Order.AmountTotal = Sum(Items.Amount)
Items.Amount = Quantity * UnitPrice
Store the Items.UnitPrice as a copy from Product.UnitPrice
```

Note the hint above. As we've heard, "AI requires adult supervision." The hint was required to get the desired SQL. This produces standard SQL. Copy the generated SQL commands into a file, say, sample_ai.sql, then create the database:

```shell
sqlite3 sample_ai.sqlite < sample_ai.sql
```

2. API Logic Server: Create

Given a database (whether or not it was created from AI), API Logic Server creates an executable, customizable project with the following single command:

```shell
ApiLogicServer create --project_name=sample_ai --db_url=sqlite:///sample_ai.sqlite
```

This creates a project you can open with your IDE, such as VSCode. The project is ready to run; press F5. It reflects the automation provided by the create command:

- API automation: a self-serve API ready for UI developers
- App automation: an admin app ready for back-office data maintenance and business user collaboration

Let's explore the app and API automation from the create command.

App Automation

App automation means that ApiLogicServer create builds a multi-page, multi-table admin app automatically. This does not consist of hundreds of lines of complex HTML and JavaScript; it's a simple YAML file that's easy to customize. It is ready for business user collaboration and back-office data maintenance in minutes.

API Automation

API automation means that ApiLogicServer create builds a JSON:API automatically. Your API provides an endpoint for each table, with related data access, pagination, optimistic locking, filtering, and sorting. It would take days to months to create such an API using frameworks. UI app developers can use the API to create custom apps immediately, using Swagger to design their API call and copying the URI into their JavaScript code. APIs are thus self-serve: no server coding is required. Custom app dev is unblocked: Day 1.
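To give a feel for the self-serve API, here is a hedged client-side sketch. The endpoint shape follows JSON:API conventions; the base URL, port, resource name, and parameter names are assumptions, so check the generated project's Swagger page for the authoritative ones:

```python
import requests

# Hypothetical local project URL; the generated Swagger UI documents the real endpoints
BASE_URL = "http://localhost:5656/api"

# Request a page of 5 customers, sorted by name (JSON:API-style query parameters)
response = requests.get(
    f"{BASE_URL}/Customer",
    params={"page[limit]": 5, "sort": "Name"},
)
response.raise_for_status()

for row in response.json()["data"]:
    print(row["id"], row["attributes"])
```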
3. Customize

So, we have working software in minutes. It's running, but we can't really deploy it until we add logic and security, which brings us to customization. Projects are designed for customization using standards: Python, frameworks (e.g., Flask, SQLAlchemy), and your IDE for code editing and debugging. Not only Python code, but also rules.

Logic Automation

Logic automation means that you can declare spreadsheet-like rules using Python. Such logic maintains database integrity with multi-table derivations, constraints, and security. Rules are 40X more concise than traditional code and can be extended with Python. Rules are an executable design. Using your IDE (code completion, etc.), the five spreadsheet-like rules that implement the Check Credit requirement (the balance and amount derivations, the unit-price copy, and the credit-limit constraint) replace some 280 lines of code, and they map exactly to our natural language design.

1. Debugging

The logic declarations live in ordinary Python, so we debug them like any other code: execution pauses at a breakpoint in the debugger, where we can examine state and execute step by step. Note the logging for inserting an Item: each line represents a rule firing and shows the complete state of the row.

2. Chaining: Multi-Table Transaction Automation

Note that it's a multi-table transaction, as indicated by the log indentation. This is because, like a spreadsheet, rules automatically chain, including across tables.

3. 40X More Concise

The five spreadsheet-like rules represent the same logic as roughly 200 lines of code. That's a remarkable 40X decrease in the backend half of the system.

4. Automatic Re-use

The logic above, perhaps conceived for Place Order, applies automatically to all transactions: deleting an order, changing items, moving an order to a new customer, etc. This reduces code and promotes quality (no missed corner cases).

5. Automatic Optimizations

SQL overhead is minimized by pruning and by eliminating expensive aggregate queries. These can have an order-of-magnitude impact. This is because the rule engine is not based on a Rete algorithm but is highly optimized for transaction processing and integrated with the SQLAlchemy ORM (Object-Relational Mapper).

6. Transparent

Rules are an executable design. They map exactly to our natural language design, readable by business users. This complements running screens to facilitate agile collaboration.

Security Automation

Security automation means you activate login-access security and declare grants (using Python) to control row access for user roles. Here, we filter out less active accounts for users with the sales role:

```python
Grant(on_entity=models.Customer,
      to_role=Roles.sales,
      filter=lambda: models.Customer.CreditLimit > 3000,
      filter_debug="CreditLimit > 3000")
```

4. Iterate: Rules + Python

So, we have completed our one-day project. The working screens and rules facilitate agile collaboration, which leads to agile iterations. Automation helps here, too: not only are spreadsheet-like rules 40X more concise, they also meaningfully simplify iterations and maintenance. Let's explore this with two changes.

Requirement 1: Green Discounts

```
Give a 10% discount for carbon-neutral products for 10 items or more.
```

Requirement 2: Application Integration

```
Send new Orders to Shipping using a Kafka message.
Enable B2B partners to place orders with a custom API.
```

Revise Data Model

In this example, a schema change was required to add the Product.CarbonNeutral column. This affects the ORM models, the API, etc. So, we want these updated but retain our customizations. This is supported by the ApiLogicServer rebuild-from-database command, which updates existing projects to a revised schema while preserving customizations.

Iterate Logic: Add Python

Our revised logic applies the discount and sends the new-order Kafka message.
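A rough sketch of what the discount formula might look like, using LogicBank-style rule declarations; the imports, model names (models.Item, Product.CarbonNeutral), and relationship accessors are assumptions based on the schema created earlier, and the Kafka send is not shown:

```python
from decimal import Decimal

from logic_bank.logic_bank import Rule
from database import models  # assumed project module containing the SQLAlchemy models


def derive_item_amount(row: "models.Item", old_row, logic_row):
    """Quantity * UnitPrice, with a 10% discount for carbon-neutral products of 10 items or more."""
    amount = row.Quantity * row.UnitPrice
    if row.Product.CarbonNeutral and row.Quantity >= 10:
        amount = amount * Decimal("0.9")
    return amount


# Replaces the plain Items.Amount = Quantity * UnitPrice formula
Rule.formula(derive=models.Item.Amount, calling=derive_item_amount)
```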
Extend API

We can also extend our API with a new B2BOrder endpoint using standard Python and Flask. Note: Kafka is not activated in this example. To explore a running tutorial for application integration with Kafka, click here.

Notes on Iteration

This illustrates some significant aspects of how logic supports iteration.

Maintenance Automation

Along with perhaps documentation, maintenance is one of the tasks programmers most loathe. That's because it's not about writing code but archaeology: deciphering code someone else wrote, just so you can add four or five lines that will hopefully be called and function correctly. Logic automation changes that with maintenance automation, which means:

- Rules automatically order their execution (and optimizations) based on system-discovered dependencies.
- Rules are automatically reused for all relevant transactions.

So, to alter logic, you just "drop a new rule in the bucket," and the system will ensure it's called in the proper order and reused across all the relevant use cases.

Extensibility: With Python

In the first case, we needed to do some if/else testing, and it was more convenient to add a dash of Python. While this is pretty simple Python, used as a 4GL, you have the full power of object-oriented Python and its many libraries. For example, our extended API leverages Flask and open-source libraries for Kafka messages.

Rebuild: Logic Preserved

Recall that we were able to iterate the schema and use the ApiLogicServer rebuild-from-database command. This updates the existing project while preserving customizations.

5. Deploy

API Logic Server provides scripts to create Docker images from your project. You can deploy these to the cloud or your local server. For more information, see here.

Summary

In minutes, you've used ChatGPT and API Logic Server to convert an idea into working software. It required only five rules and a few dozen lines of Python. The process is simple:

- Create the schema with ChatGPT.
- Create the project with ApiLogicServer:
  - A self-serve API to unblock UI developers: Day 1
  - An admin app for business user collaboration: Day 1
- Customize the project:
  - With rules: 40X more concise than code.
  - With Python: for complete flexibility.
- Iterate the project in your IDE to implement new requirements. Prior customizations are preserved.

It all works with standard tooling: Python, your IDE, and container-based deployment. You can execute the steps in this article with the detailed tutorial: click here.
It was a really snowy day when I started this. I saw the IBM watsonx Python SDK and realized I needed to wire up my Gen AI model (LLM) to send my context-augmented prompt from Slack. Why not create a Python processor for Apache NiFi 2.0.0? I figured it wouldn't be hard. It was easy!

IBM watsonx.ai has a huge list of powerful foundation models to choose from; just don't pick the v1 models, as they are going to be removed in a few months. (See GitHub, IBM/watsonxdata-python-sdk: the watsonx.data Python SDK.)

After we picked a model, I tested it in watsonx's Prompt Lab. Then I ported it to a simple Python program. Once that worked, I started adding features like properties and the transform method. That's it.

Source Code

Here is the link to the source code. Now we can drop our new LLM-calling processor into a flow and use it like any other built-in processor. Note that the Python API requires Python 3.9+ on the machine hosting NiFi.

Package-Level Dependencies

Add them to requirements.txt.

Basic Format for the Python Processor

You need to import various things from the nifiapi library. You then set up your class, CallWatsonXAI, and include the Java class definition and ProcessorDetails, which include the NiFi version, dependencies, a description, and some tags.

```python
class ProcessorDetails:
    version = '0.0.1-SNAPSHOT'
    dependencies = ['pandas']
```

Define All the Properties for the Processor

You need to set up a PropertyDescriptor for each property, including things like a name, a description, required, validators, expression_language_scope, and more.

Transform Main Method

Here we include the imports needed. You can access properties via context.getProperty. You can then set attributes for the output, as shown via attributes, and set the contents of the output flow file. Finally, there is the relationship, which for successful processing is success. You should add something to handle errors; I still need to add that.

If you need to redeploy, debug, or fix something: "While you may delete the entire work directory while NiFi is stopped, doing so may result in NiFi taking significantly longer to startup the next time, as it must source all extensions' dependencies from PyPI, as well as expand all Java extensions' NAR files." (See the NiFi Python Developer's Guide.)

To deploy the processor, we just need to copy the Python file to the nifi-2.0.0/python/extensions directory and possibly restart the NiFi server(s). I would start developing locally on your laptop with either a local GitHub build or Docker.
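Putting those pieces together, a skeleton for such a processor might look roughly like this. The property names, dependency list, and the watsonx.ai call itself are assumptions, so treat it as a sketch and check the NiFi Python Developer's Guide for the authoritative API:

```python
import json

from nifiapi.flowfiletransform import FlowFileTransform, FlowFileTransformResult
from nifiapi.properties import PropertyDescriptor, StandardValidators


class CallWatsonXAI(FlowFileTransform):
    class Java:
        implements = ['org.apache.nifi.python.processor.FlowFileTransform']

    class ProcessorDetails:
        version = '0.0.1-SNAPSHOT'
        description = 'Calls an IBM watsonx.ai foundation model with the flow file contents as the prompt.'
        tags = ['watsonx', 'llm', 'ai']
        dependencies = ['ibm-watson-machine-learning']

    # Hypothetical property; real ones would also cover API key, project ID, endpoint, etc.
    MODEL_ID = PropertyDescriptor(
        name='Model ID',
        description='The watsonx.ai foundation model to call.',
        required=True,
        validators=[StandardValidators.NON_EMPTY_VALIDATOR]
    )

    def getPropertyDescriptors(self):
        return [self.MODEL_ID]

    def transform(self, context, flowfile):
        model_id = context.getProperty(self.MODEL_ID).getValue()
        prompt = flowfile.getContentsAsBytes().decode('utf-8')

        # The actual watsonx.ai call is omitted; this sketch just echoes the inputs
        result = {'model_id': model_id, 'prompt': prompt, 'generated_text': '<watsonx.ai response here>'}

        return FlowFileTransformResult(
            relationship='success',
            contents=json.dumps(result),
            attributes={'model_id': model_id}
        )
```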
Now that we have written a processor, let's use it in a real-time streaming data pipeline application.

Example Application

Building off our previous application that receives Slack messages, we take those Slack queries, run them against the Pinecone or Chroma vector databases, and send the resulting context along with our call to IBM's watsonx.ai REST API for generative AI (LLM). You can find the previous details here:

- Building a Real-Time Slackbot With Generative AI
- Codeless Generative AI Pipelines with Chroma Vector DB & Apache NiFi
- Streaming LLM with Apache NiFi (HuggingFace)
- Augmenting and Enriching LLM with Real-Time Context

NiFi Flow

- ListenHTTP: On port 9518/slack; NiFi is a universal REST endpoint.
- QueryRecord: JSON cleanup.
- SplitJson: $.*
- EvaluateJsonPath: Output attribute for $.inputs.
- QueryChroma: Call the server on port 9776 using the ONNX model; export 25 rows.
- QueryRecord: JSON -> JSON; limit 1.
- SplitRecord: JSON -> JSON; into 1 row.
- EvaluateJsonPath: Export the context from $.document.
- ReplaceText: Make the context the new flow file content.
- UpdateAttribute: Update inputs.
- CallWatsonX: Our Python processor to call IBM.
- SplitRecord: 1 record; JSON -> JSON.
- EvaluateJsonPath: Add attributes.
- AttributesToJSON: Make a new flow file from attributes.
- QueryRecord: Validate JSON.
- UpdateRecord: Add generated text, inputs, timestamp, UUID.
- Kafka path, PublishKafkaRecord_2_6: Send results to Kafka.
- Kafka path, RetryFlowFile: If the Apache Kafka send fails, try again.
- Slack path, SplitRecord: Split into 1 record for display.
- Slack path, EvaluateJsonPath: Pull out fields to display.
- Slack path, PutSlack: Send a formatted message to the #chat group.

This is a full-fledged Retrieval Augmented Generation (RAG) application utilizing ChromaDB. (The NiFi flow can also use Pinecone. I am working on Milvus, Solr, and OpenSearch next.)

Enjoy how easy it is to add Python code to your distributed NiFi applications.
@dataclass is a decorator that is part of the Python dataclasses module. When the @dataclass decorator is used, it automatically generates special methods such as:

- __init__: Constructor to initialize fields
- __repr__: String representation of the object
- __eq__: Equality comparison between objects
- __hash__: Enables use as dictionary keys; note that with the default settings (eq=True, frozen=False) __hash__ is set to None, so it is only generated for frozen (or unsafe_hash=True) data classes

Along with the methods listed above, the @dataclass decorator has two important parameters:

- Order: If order=True (the default is False), the __lt__(), __le__(), __gt__(), and __ge__() methods are generated; i.e., @dataclass(order=True).
- Immutability: Fields can be made immutable using the frozen=True parameter; i.e., @dataclass(frozen=True).

In a nutshell, the primary goal of the @dataclass decorator is to simplify the creation of classes.

Advantages of the dataclass Decorator

Using the dataclass decorator has several advantages:

- Boilerplate reduction: It reduces the amount of boilerplate code needed for classes by automatically generating common special methods.
- Readability: It improves the readability of the code by making it more concise and focused on the data representation.
- Default values: You can provide default values for attributes directly in the class definition, reducing the need for explicit __init__() methods.
- Immutability: By combining @dataclass with the frozen=True option, you can create immutable data classes, ensuring that instances cannot be modified after creation.

Usage

```python
from dataclasses import dataclass

@dataclass
class Person:
    name: str
    age: int
```

In this example, the Person class is annotated with @dataclass, and two fields (name and age) are declared. The __init__(), __repr__(), and __eq__() methods are automatically generated. Here's an explanation of how to use each generated method:

__init__(self, ...): The __init__ method is automatically generated with parameters corresponding to the annotated attributes. You can create instances of the class by providing values for the attributes.

```python
person = Person('Sam', 45)
```

__repr__(self) -> str: The __repr__ method returns a string representation of the object, useful for debugging and logging. When you print an object or use it in an f-string, this representation is shown.

```python
person  # Person(name='Sam', age=45)
```

__eq__(self, other) -> bool: The __eq__ method checks for equality between two objects based on their attributes. It is used when you compare objects using the equality operator (==).

```python
person1 = Person('Sam', 45)
person2 = Person('Sam', 46)
print(person1 == person2)  # False
```

__hash__(self) -> int: The __hash__ method generates a hash value for the object, allowing instances to be used in sets and as dictionary keys. It is generated for frozen data classes (or when unsafe_hash=True) and is required when the class is used as a key in a dictionary or an element in a set.

Ordering

If you include the order=True option, additional ordering methods (__lt__, __le__, __gt__, and __ge__) are generated. These methods allow instances to be compared using the less than, less than or equal, greater than, and greater than or equal operators. If you perform a comparison on a Person object without order, a TypeError is thrown.

```python
print(person1 < person2)
# TypeError: '<' not supported between instances of 'Person' and 'Person'
```

After adding ordering, we can perform comparisons. Fields are compared as a tuple, in definition order:

```python
@dataclass(order=True)
class Person:
    name: str
    age: int

person1 = Person('Sam', 45)
person2 = Person('Sam', 46)
print(person1 < person2)  # True: the names are equal, and 45 < 46
```
order is False by default, meaning comparison methods are not generated unless explicitly enabled. Comparisons are based on field values, not object identities.

Immutability

A @dataclass can be made immutable using the frozen=True parameter; the default is False.

```python
@dataclass
class Person:
    name: str
    age: int

person = Person('Sam', 45)
person.name = 'Sam2'
person  # Person(name='Sam2', age=45)
```

In the code above, we are able to reassign a value to the Person name field. After adding frozen=True, an exception is thrown and reassignment is not allowed.

```python
@dataclass(frozen=True)
class Person:
    name: str
    age: int

person = Person('Sam', 45)
person.name = 'Sam2'  # FrozenInstanceError: cannot assign to field 'name'
```

Be aware of the performance implications: frozen=True adds a slight overhead because of the additional checks for immutability.

Default Values

Using the dataclasses module, we can assign default values to fields in the class definition.

```python
from dataclasses import dataclass, field

@dataclass
class Person:
    name: str
    age: int = field(default=20)

person = Person('Sam')
person  # Person(name='Sam', age=20)
```

Defaults are evaluated only once, when the class is defined, not each time an instance is created. For that reason, mutable defaults such as lists or dicts are rejected with a ValueError; they must be supplied through field(default_factory=...) instead, as sketched below.
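A minimal sketch of default_factory with a mutable default (the field and class names here are just for illustration):

```python
from dataclasses import dataclass, field

@dataclass
class Person:
    name: str
    age: int = 20
    nicknames: list[str] = field(default_factory=list)  # a fresh list per instance

p1 = Person('Sam')
p2 = Person('Alex', 30)
p1.nicknames.append('Sammy')

print(p1)  # Person(name='Sam', age=20, nicknames=['Sammy'])
print(p2)  # Person(name='Alex', age=30, nicknames=[])
```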
Last year, I wrote a post on OpenTelemetry Tracing to understand the subject better. I also created a demo around it, which featured the following components:

- The Apache APISIX API Gateway
- A Kotlin/Spring Boot service
- A Python/Flask service
- And a Rust/Axum service

I've recently improved the demo to deepen my understanding and want to share what I learned.

Using a Regular Database

In the initial demo, I didn't bother with a regular database. Instead:

- The Kotlin service used the embedded Java H2 database
- The Python service used the embedded SQLite
- The Rust service used hard-coded data in a hash map

I replaced all of them with a regular PostgreSQL database, with a dedicated schema for each service. The OpenTelemetry agent adds a new span when connecting to the database on the JVM and in Python. On the JVM, it's automatic when one uses the Java agent. In Python, one needs to install the relevant package; see the next section.

OpenTelemetry Integrations in Python Libraries

Python requires you to explicitly add the package that instruments a specific library for OpenTelemetry. For example, the demo uses Flask; hence, we should add the Flask integration package. However, this can become a pretty tedious process. Yet, once you've installed opentelemetry-distro, you can "sniff" the installed packages and install the relevant integrations:

```shell
pip install opentelemetry-distro
opentelemetry-bootstrap -a install
```

For the demo, it installs the following:

```
opentelemetry_instrumentation-0.41b0.dist-info
opentelemetry_instrumentation_aws_lambda-0.41b0.dist-info
opentelemetry_instrumentation_dbapi-0.41b0.dist-info
opentelemetry_instrumentation_flask-0.41b0.dist-info
opentelemetry_instrumentation_grpc-0.41b0.dist-info
opentelemetry_instrumentation_jinja2-0.41b0.dist-info
opentelemetry_instrumentation_logging-0.41b0.dist-info
opentelemetry_instrumentation_requests-0.41b0.dist-info
opentelemetry_instrumentation_sqlalchemy-0.41b0.dist-info
opentelemetry_instrumentation_sqlite3-0.41b0.dist-info
opentelemetry_instrumentation_urllib-0.41b0.dist-info
opentelemetry_instrumentation_urllib3-0.41b0.dist-info
opentelemetry_instrumentation_wsgi-0.41b0.dist-info
```

The above setup adds a new automated trace for database connections.

Gunicorn on Flask

Every time I started the Flask service, it showed a warning in red that its built-in server shouldn't be used in production. While this is unrelated to OpenTelemetry, and though nobody complained, I was not too fond of it. For this reason, I added a "real" HTTP server. I chose Gunicorn, for no other reason than that my knowledge of the Python ecosystem is still shallow. The server is a runtime concern; we only need to change the Dockerfile slightly:

```dockerfile
RUN pip install gunicorn
ENTRYPOINT ["opentelemetry-instrument", "gunicorn", "-b", "0.0.0.0", "-w", "4", "app:app"]
```

- The -b option refers to binding; you can attach to a specific IP. Since I'm running Docker, I don't know the IP, so I bind to any address.
- The -w option specifies the number of workers.
- Finally, the app:app argument sets the module and the application, separated by a colon.

Gunicorn usage doesn't impact the OpenTelemetry integrations.

Heredocs for the Win

You may benefit from this if you write a lot of Dockerfiles. Every Docker layer has a storage cost. Hence, inside a Dockerfile, one tends to avoid unnecessary layers. For example, the two following snippets yield the same results:
```dockerfile
RUN pip install pip-tools
RUN pip-compile
RUN pip install -r requirements.txt
RUN pip install gunicorn
RUN opentelemetry-bootstrap -a install
```

```dockerfile
RUN pip install pip-tools \
 && pip-compile \
 && pip install -r requirements.txt \
 && pip install gunicorn \
 && opentelemetry-bootstrap -a install
```

The first snippet creates five layers, while the second creates only one; however, the first is more readable than the second. With heredocs, we get a more readable syntax that still creates a single layer:

```dockerfile
RUN <<EOF
pip install pip-tools
pip-compile
pip install -r requirements.txt
pip install gunicorn
opentelemetry-bootstrap -a install
EOF
```

Heredocs are a great way to write more readable and more optimized Dockerfiles. Try them!

Explicit API Calls on the JVM

In the initial demo, I showed two approaches:

- The first uses auto-instrumentation, which requires no additional action
- The second uses manual instrumentation with Spring annotations

In the improved version, I wanted to demo an explicit call with the API. The use case is analytics and uses a message queue: I get the trace data from the HTTP call and create a message carrying that data, so the subscriber can use it as a parent. First, we need to add the OpenTelemetry API dependency to the project. We inherit the version from the Spring Boot Starter parent POM:

```xml
<dependency>
    <groupId>io.opentelemetry</groupId>
    <artifactId>opentelemetry-api</artifactId>
</dependency>
```

At this point, we can access the API. OpenTelemetry offers a static method to get an instance:

```kotlin
val otel = GlobalOpenTelemetry.get()
```

At runtime, the agent works its magic to return the instance. The flow goes something like this:

```kotlin
val otel = GlobalOpenTelemetry.get()                           //1
val tracer = otel.tracerBuilder("ch.frankel.catalog").build()  //2
val span = tracer.spanBuilder("AnalyticsFilter.filter")        //3
    .setParent(Context.current())                              //4
    .startSpan()                                               //5

// Do something here

span.end()                                                     //6
```

1. Get the underlying OpenTelemetry
2. Get the tracer builder and "build" the tracer
3. Get the span builder
4. Attach the span to the whole chain via the current context
5. Start the span
6. End the span; after this step, the data is sent to the configured OpenTelemetry endpoint

Adding a Message Queue

When I gave the talk based on the post, attendees frequently asked whether OpenTelemetry would work with messaging such as MQ or Kafka. While I thought it was the case in theory, I wanted to make sure of it, so I added a message queue to the demo under the pretense of analytics. The Kotlin service publishes a message to an MQTT topic on each request, and a NodeJS service subscribes to the topic.

Attaching OpenTelemetry Data to the Message

So far, OpenTelemetry has automatically read the context to find the trace ID and the parent span ID. Whatever the approach, auto-instrumentation or manual, annotations-based or explicit, the library takes care of it. I didn't find any similar automation for messaging, so we need to code our way in. The gist of OpenTelemetry propagation is the traceparent HTTP header. We need to read it and send it along with the message. First, let's add the MQTT API to the project:

```xml
<dependency>
    <groupId>org.eclipse.paho</groupId>
    <artifactId>org.eclipse.paho.mqttv5.client</artifactId>
    <version>1.2.5</version>
</dependency>
```

Interestingly enough, the API doesn't allow access to the traceparent directly. However, we can reconstruct it via the SpanContext class. I'm using MQTT v5 for my message broker.
Note that v5 allows metadata to be attached to the message; when using v3, the message itself needs to wrap it.

```kotlin
val spanContext = span.spanContext                                                                     //1
val message = MqttMessage().apply {
    properties = MqttProperties().apply {
        val traceparent = "00-${spanContext.traceId}-${spanContext.spanId}-${spanContext.traceFlags}"  //2
        userProperties = listOf(UserProperty("traceparent", traceparent))                              //3
    }
    qos = options.qos
    isRetained = options.retained
    val hostAddress = req.remoteAddress().map { it.address.hostAddress }.getOrNull()
    payload = Json.encodeToString(Payload(req.path(), hostAddress)).toByteArray()                      //4
}
val client = MqttClient(mqtt.serverUri, mqtt.clientId)                                                 //5
client.publish(mqtt.options, message)                                                                  //6
```

1. Get the span context
2. Construct the traceparent from the span context, according to the W3C Trace Context specification
3. Set the message metadata
4. Set the message body
5. Create the client
6. Publish the message

Getting OpenTelemetry Data From the Message

The subscriber is a new component based on NodeJS. First, we configure the app to use the OpenTelemetry trace exporter:

```javascript
const sdk = new NodeSDK({
    resource: new Resource({[SemanticResourceAttributes.SERVICE_NAME]: 'analytics'}),
    traceExporter: new OTLPTraceExporter({
        url: `${collectorUri}/v1/traces`
    })
})
sdk.start()
```

The next step is to read the metadata, recreate the context from the traceparent, and create a span:

```javascript
client.on('message', (aTopic, payload, packet) => {
    if (aTopic === topic) {
        console.log('Received new message')
        const data = JSON.parse(payload.toString())
        const userProperties = {}
        if (packet.properties['userProperties']) {                                    //1
            const props = packet.properties['userProperties']
            for (const key of Object.keys(props)) {
                userProperties[key] = props[key]
            }
        }
        const activeContext = propagation.extract(context.active(), userProperties)   //2
        const tracer = trace.getTracer('analytics')
        const span = tracer.startSpan(                                                //3
            'Read message',
            {attributes: {path: data['path'], clientIp: data['clientIp']}},
            activeContext,
        )
        span.end()                                                                    //4
    }
})
```

1. Read the metadata
2. Recreate the context from the traceparent
3. Create the span
4. End the span

For the record, I tried to migrate to TypeScript, but when I did, I didn't receive the message. Help or hints are very welcome!

Apache APISIX for Messaging

Though it's not common knowledge, Apache APISIX can proxy not only HTTP calls but also UDP and TCP messages. It only offers a few plugins for this at the moment, but more will be added, and an OpenTelemetry one will surely be part of them. In the meantime, let's prepare for it. The first step is to configure Apache APISIX to allow both HTTP and TCP:

```yaml
apisix:
  proxy_mode: http&stream        #1
  stream_proxy:
    tcp:
      - addr: 9100               #2
        tls: false
```

1. Configure APISIX for both modes
2. Set the TCP port

The next step is to configure TCP routing:

```yaml
upstreams:
  - id: 4
    nodes:
      "mosquitto:1883": 1        #1
stream_routes:                   #2
  - id: 1
    upstream_id: 4
    plugins:
      mqtt-proxy:                #3
        protocol_name: MQTT
        protocol_level: 5        #4
```

1. Define the MQTT queue as the upstream
2. Define the "streaming" route; APISIX treats everything that's not HTTP as streaming
3. Use the MQTT proxy; note that APISIX also offers a Kafka-based one
4. Set the MQTT protocol level: 5 for MQTT v5 (use 4 for v3.1.1)

Finally, we can replace the MQTT URLs in the Docker Compose file with the APISIX URLs.

Conclusion

I've described several items I added to improve my OpenTelemetry demo in this post. While most are indeed related to OpenTelemetry, some of them aren't. I may add another component in a different stack next, such as a front-end.
The complete source code for this post can be found on GitHub.
In our workplace, we took on a challenge as front-end developers to delve into Rust and explore how we can create web applications. The initial step was familiarizing ourselves with the language’s fundamentals by studying the documentation. When I started studying Rust, I recognized similarities between Rust and JS/TS, and drawing these parallels was important for me to facilitate a more intuitive understanding. I wanted to share my learning path, so I wrote this article outlining my exploration of Rust from the perspective of a front-end developer.

The Rust programming language was originally developed by Mozilla for Firefox, and it is also used at major companies such as Facebook, Apple, Amazon, Microsoft, and Google. Notable projects like Dropbox, npm, GitHub, and Deno leverage Rust too. Also, Turbopack, the Rust-based compiler of Next.js, has contributed to a remarkable 94.7% increase in speed in Next.js version 14.

Rust’s growing popularity can be attributed to several key factors. Firstly, Rust is a compiled language that generates efficient machine code, which ensures that applications developed with Rust deliver exceptional performance. Moreover, Rust is highly reliable thanks to its compiler, which effectively prevents undefined behavior that might otherwise result in unexpected outcomes or crashes. Another factor is its memory efficiency. While many languages either manage memory automatically, like JavaScript’s garbage collector, or give complete control over memory management, as in C or C++, Rust introduces a unique approach called the ownership model. (We will get back to this topic later.)

In the upcoming sections of this article, we will explore essential topics of the Rust programming language from the perspective of a front-end developer. These topics include data types, variables, mutability, functions, tuples, arrays, structs, references, and borrowing.

Rust Programming Language Data Types
The contrast between JavaScript and the Rust programming language primarily manifests in their approach to data types. JavaScript adopts a dynamic typing system, while Rust employs static typing. In Rust, the types of all variables must be known at compile time, a characteristic that aligns more closely with TypeScript. Every value in Rust is associated with a specific data type, and these types are categorized into two main groups: scalar and compound types. In contrast, JS/TS has a small set of data types such as numbers, strings, booleans, and objects. Scalar types in Rust include integers (both signed and unsigned), floating-point numbers, booleans, and characters, while compound types comprise tuples and arrays.

Integers
A notable distinction in data types is that, unlike JS/TS, Rust provides size-specific choices for integers and floating-point numbers. This allows you to precisely regulate the amount of memory allocated for each type. Consequently, Rust stands out for its memory efficiency and high performance.
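As a minimal sketch (the variable names are illustrative), signed types such as i8, i32, and i64 and unsigned types such as u8 and u64 let you choose exactly how many bits each value occupies:

Rust
let small: i8 = -128;          // 8-bit signed, range -128..=127
let answer: i32 = 42;          // 32-bit signed, the default integer type
let byte: u8 = 255;            // 8-bit unsigned, range 0..=255
let big: u64 = 18_446_744_073; // 64-bit unsigned; underscores improve readability

When no annotation is given, Rust infers i32 by default, which is usually a sensible choice unless you have a specific size or sign requirement.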
Floating-Point Numbers
Rust offers two floating-point types: f32 and f64, with sizes of 32 bits and 64 bits. The default type is f64 because it offers speed similar to f32 while providing more precision. It’s important to note that all floating-point types in Rust are signed.

let x = 5.8;

Booleans
In Rust, like in JS/TS, the Boolean type has two potential values: true and false. Booleans are one byte in size and are denoted by bool.

let isRustReliable = true;
let isRustReliable: bool = true;

Characters
Rust’s char type is four bytes in size. It specifically represents a Unicode Scalar Value, allowing it to encompass a broader range of characters beyond ASCII. This includes accented letters, characters from different alphabets, emojis, and zero-width spaces, making Rust’s char type more versatile in handling diverse character sets.

let char_type = 'a';
let char_type: char = 'A';

Tuples
A tuple groups together values of a variety of types into one compound type. Tuples have a fixed length: once declared, they cannot grow or shrink.

let origin: (i8, u8, f64) = (-5, 2, 2.2);
let (a, b, c) = origin;      // destructuring a tuple
let firstElement = origin.0; // indexing

Arrays
Arrays, in contrast to tuples, require each of their elements to share the same type. Unlike arrays in JS/TS, Rust arrays have a fixed length, making it impossible to add or remove elements directly. If dynamic resizing is needed, similar to arrays in JS/TS, Vectors in Rust would be the suitable alternative, as shown in the sketch below.

let origin: [i8; 3] = [1, 2, 3];
let origin = [4; 3]; // means [4, 4, 4]
let first = origin[0];
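Since JS/TS arrays grow and shrink freely, here is a minimal sketch (the values are only illustrative) of the Vec type mentioned above:

Rust
let mut numbers: Vec<i8> = Vec::new(); // growable, unlike a fixed-length array
numbers.push(1);
numbers.push(2);
numbers.push(3);
let tripled = vec![4; 3];              // the vec! macro, similar to [4; 3]
let first = numbers[0];                // indexing works like with arrays
let last = numbers.pop();              // removes and returns the last element as an Option

Like an array, a Vec requires all elements to share the same type, but its length can change at runtime, making it the closest analogue to a JS array.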
Variables and Mutability
In Rust, the default behavior for variables is immutability, meaning that their values cannot be changed once assigned. The let keyword is used to declare variables, and if you want a mutable variable, you need to explicitly use mut after let.

// Immutable variable
let x = 5;

// Mutable variable
let mut y = 10;
y = 15; // Valid because y is mutable

There are also constants, and one notable distinction from let is that you cannot use mut with constants. Constants, unlike variables, are inherently immutable. They are declared using the const keyword, and their type must be explicitly annotated.

const MULTIPLIER: u32 = 5;

Constants can be declared in any scope, including the global scope, making them valuable for values shared across different parts of the code.

Functions
In the Rust programming language, function bodies consist of a series of statements and optionally end in an expression. Statements are instructions that perform actions but do not return a value, while expressions evaluate to a resultant value.

fn main() {
    let y = {
        let x = 3; // statement
        x + 1      // expression whose value is assigned to y
    };
    println!("The value of y is: {y}");
}

In JavaScript, you can create functions using either function declarations or function expressions. In Rust, you can use function declarations or lambda functions, known as closures, each with its own syntax and distinctive features.

JS
// Function Declaration
function add(a, b) {
  return a + b;
}

// Function Expression
const subtract = function(a, b) {
  return a - b;
};

Rust
// Function Declaration
fn add(a: i32, b: i32) -> i32 {
    a + b
}

// Closure (Lambda Function)
let subtract = |a: i32, b: i32| -> i32 { a - b };

Ownership, References, and Borrowing
Ownership is a fundamental concept in the Rust programming language that establishes a set of rules governing how the language manages memory throughout the program’s execution. In JavaScript, memory management is typically handled by a garbage collector, which automatically reclaims memory that is no longer in use, relieving developers of explicit memory management responsibilities. Unlike Rust, JavaScript’s abstraction of memory details makes it well-suited for high-level development but sacrifices fine-grained control over memory allocation and deallocation.

The stack, organized as a last-in, first-out (LIFO) structure, stores values with a known, fixed size, and it contrasts with the heap, a less organized memory space for data whose size is unknown at compile time. Rust’s memory allocator dynamically locates an available space in the heap, designates it as in use, and returns a pointer representing the address of that location.

The key ownership rules in Rust are that each value has an owner, there can be only one owner at a time, and the value is dropped from memory when its owner goes out of scope. In Rust, this automatic memory management is facilitated by the drop function, which is called when a variable goes out of scope.

rust
{                    // `s` is not valid here; it’s not yet declared
    let s = "hello"; // `s` is valid from this point forward
    // do stuff with `s`
}                    // this scope is now over, and `s` is no longer valid

Rust uses references, denoted by the & symbol, allowing you to refer to a value without taking ownership of it. References ensure that the data they point to remains valid for the duration of the reference’s lifetime.

let original = String::from("hello");
let reference = &original; // `reference` borrows `original` without taking ownership

Rust’s references are immutable by default, meaning they cannot modify the data they point to. However, mutable references, denoted by &mut, allow modifications to the referenced data.

let mut value = 42;
let reference = &mut value; // Mutable reference to `value`

// Modify `value` through the mutable reference
*reference = 10;

A significant restriction on mutable references is that if you have one, you cannot have any other references, mutable or immutable, to the same value.

let mut s = String::from("hello");
let r1 = &mut s;
let r2 = &mut s;        // ERROR: Cannot have a second mutable reference to `s`
println!("{r1}, {r2}"); // both mutable borrows are used here, so they overlap

This restriction prevents data races at compile time. Data races occur when two or more pointers access the same data at the same time, at least one of them writes to the data, and there is no mechanism to synchronize the access.

// In Rust, you cannot mix mutable and immutable references.
let mut s = String::from("hello");
let r1 = &s;     // No problem
let r2 = &s;     // No problem
let r3 = &mut s; // BIG PROBLEM: Cannot have a mutable reference alongside immutable references
println!("{r1}, {r2}, and {r3}"); // all three borrows are used here, so they overlap

The Rust programming language enforces the rule that you can either have one mutable reference or any number of immutable references to a value; this is the principle of exclusive mutable or shared immutable references.
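As a minimal sketch of how these rules play out in practice, the compiler tracks where each borrow is last used, so a mutable borrow becomes legal once the earlier borrows are no longer needed:

Rust
fn main() {
    let mut s = String::from("hello");

    let r1 = &s; // immutable borrow
    let r2 = &s; // a second immutable borrow is fine
    println!("{r1} and {r2}");
    // r1 and r2 are not used after this point, so their borrows end here

    let r3 = &mut s; // a mutable borrow is now allowed
    r3.push_str(", world");
    println!("{r3}");
}

This is why the failing examples above use the references after the conflicting borrow is created: it is the overlap of the borrows, not their mere declaration, that the compiler rejects.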
This article builds on a previous article that describes the AIDocumentLibraryChat project with a RAG-based search service built on the OpenAI Embedding/GPT model services. The AIDocumentLibraryChat project has been extended with the option to use local AI models with the help of Ollama. The advantage is that the documents never leave the local servers, which is a solution for cases where it is prohibited to transfer the documents to an external service.

Architecture
With Ollama, the AI model can run on a local server, which changes the architecture. All needed systems can be deployed in a local environment controlled by the local organization. An example would be to deploy the AIDocumentLibraryChat application, the PostgreSQL DB, and the Ollama-based AI model in a local Kubernetes cluster and to provide user access to the AIDocumentLibraryChat application with an ingress. With this architecture, only the results provided by the AIDocumentLibraryChat application can be accessed by external parties.

The system architecture has the UI for the user and the application logic in the AIDocumentLibraryChat application. The application uses Spring AI with the ONNX library functions to create the embeddings of the documents. The embeddings and documents are stored with JDBC in the PostgreSQL database with the vector extension. To create the answers based on the document/paragraph content, the Ollama-based model is called via REST. The AIDocumentLibraryChat application, the PostgreSQL DB, and the Ollama-based model can each be packaged in a Docker image and deployed in a Kubernetes cluster. That makes the system independent of external systems. The Ollama models support the needed GPU acceleration on the server. The shell commands to use the Ollama Docker image are in the runOllama.sh file, and the shell commands to use the PostgreSQL DB Docker image with vector extensions are in the runPostgresql.sh file.
Building the Application for Ollama
The Gradle build of the application has been updated to switch off OpenAI support and switch on Ollama support with the useOllama property:

Groovy
plugins {
    id 'java'
    id 'org.springframework.boot' version '3.2.1'
    id 'io.spring.dependency-management' version '1.1.4'
}

group = 'ch.xxx'
version = '0.0.1-SNAPSHOT'

java {
    sourceCompatibility = '21'
}

repositories {
    mavenCentral()
    maven { url "https://repo.spring.io/snapshot" }
}

dependencies {
    implementation 'org.springframework.boot:spring-boot-starter-actuator'
    implementation 'org.springframework.boot:spring-boot-starter-data-jpa'
    implementation 'org.springframework.boot:spring-boot-starter-security'
    implementation 'org.springframework.boot:spring-boot-starter-web'
    implementation 'org.springframework.ai:spring-ai-tika-document-reader:0.8.0-SNAPSHOT'
    implementation 'org.liquibase:liquibase-core'
    implementation 'net.javacrumbs.shedlock:shedlock-spring:5.2.0'
    implementation 'net.javacrumbs.shedlock:shedlock-provider-jdbc-template:5.2.0'
    implementation 'org.springframework.ai:spring-ai-pgvector-store-spring-boot-starter:0.8.0-SNAPSHOT'
    implementation 'org.springframework.ai:spring-ai-transformers-spring-boot-starter:0.8.0-SNAPSHOT'
    testImplementation 'org.springframework.boot:spring-boot-starter-test'
    testImplementation 'org.springframework.security:spring-security-test'
    testImplementation 'com.tngtech.archunit:archunit-junit5:1.1.0'
    testRuntimeOnly 'org.junit.platform:junit-platform-launcher'
    if (project.hasProperty('useOllama')) {
        implementation 'org.springframework.ai:spring-ai-ollama-spring-boot-starter:0.8.0-SNAPSHOT'
    } else {
        implementation 'org.springframework.ai:spring-ai-openai-spring-boot-starter:0.8.0-SNAPSHOT'
    }
}

bootJar {
    archiveFileName = 'aidocumentlibrarychat.jar'
}

tasks.named('test') {
    useJUnitPlatform()
}

The Gradle build adds the Ollama Spring Starter and the Embedding library with the 'if (project.hasProperty('useOllama'))' statement; otherwise, it adds the OpenAI Spring Starter.

Database Setup
The application needs to be started with the Spring Profile 'ollama' to switch on the features needed for Ollama support. The database setup needs a different embedding vector type, which is changed with the application-ollama.properties file:

Properties files
...
spring.liquibase.change-log=classpath:/dbchangelog/db.changelog-master-ollama.xml
...

The spring.liquibase.change-log property sets the Liquibase script that includes the Ollama initialization. That script includes the db.changelog-1-ollama.xml script with the initialization:

XML
<databaseChangeLog
    xmlns="http://www.liquibase.org/xml/ns/dbchangelog"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xsi:schemaLocation="http://www.liquibase.org/xml/ns/dbchangelog
        http://www.liquibase.org/xml/ns/dbchangelog/dbchangelog-3.8.xsd">
    <changeSet id="8" author="angular2guy">
        <modifyDataType tableName="vector_store" columnName="embedding"
            newDataType="vector(384)"/>
    </changeSet>
</databaseChangeLog>

The script changes the column type of the embedding column to vector(384) to support the format that is created by the Spring AI ONNX Embedding library.
Add Ollama Support to the Application
To support Ollama-based models, the application-ollama.properties file has been added:

Properties files
spring.ai.ollama.base-url=${OLLAMA-BASE-URL:http://localhost:11434}
spring.ai.ollama.model=stable-beluga:13b
spring.liquibase.change-log=classpath:/dbchangelog/db.changelog-master-ollama.xml
document-token-limit=150

The spring.ai.ollama.base-url property sets the URL to access the Ollama model. The spring.ai.ollama.model property sets the name of the model that is run in Ollama. The document-token-limit property sets the number of tokens that the model gets as context from the document/paragraph.

The DocumentService has new features to support the Ollama models:

Java
private final String systemPrompt = "You're assisting with questions about documents in a catalog.\n"
    + "Use the information from the DOCUMENTS section to provide accurate answers.\n"
    + "If unsure, simply state that you don't know.\n"
    + "\n"
    + "DOCUMENTS:\n"
    + "{documents}";

private final String ollamaPrompt = "You're assisting with questions about documents in a catalog.\n"
    + "Use the information from the DOCUMENTS section to provide accurate answers.\n"
    + "If unsure, simply state that you don't know.\n \n"
    + " {prompt} \n \n"
    + "DOCUMENTS:\n"
    + "{documents}";

@Value("${embedding-token-limit:1000}")
private Integer embeddingTokenLimit;
@Value("${document-token-limit:1000}")
private Integer documentTokenLimit;
@Value("${spring.profiles.active:}")
private String activeProfile;

Ollama models support only system prompts, which requires a new prompt template that includes the user prompt in the {prompt} placeholder. The embeddingTokenLimit and the documentTokenLimit are now set in the application properties and can be adjusted for the different profiles. The activeProfile property gets the comma-separated list of the profiles the application was started with.

Java
public Long storeDocument(Document document) {
    ...
    var aiDocuments = tikaDocuments.stream()
        .flatMap(myDocument1 -> this.splitStringToTokenLimit(
                myDocument1.getContent(), embeddingTokenLimit).stream()
            .map(myStr -> new TikaDocumentAndContent(myDocument1, myStr)))
        .map(myTikaRecord -> new org.springframework.ai.document.Document(
            myTikaRecord.content(), myTikaRecord.document().getMetadata()))
        .peek(myDocument1 -> myDocument1.getMetadata().put(ID,
            myDocument.getId().toString()))
        .peek(myDocument1 -> myDocument1.getMetadata()
            .put(MetaData.DATATYPE, MetaData.DataType.DOCUMENT.toString()))
        .toList();
    ...
}

public AiResult queryDocuments(SearchDto searchDto) {
    ...
    Message systemMessage = switch (searchDto.getSearchType()) {
        case SearchDto.SearchType.DOCUMENT -> this.getSystemMessage(documentChunks,
            this.documentTokenLimit, searchDto.getSearchString());
        case SearchDto.SearchType.PARAGRAPH -> this.getSystemMessage(
            mostSimilar.stream().toList(), this.documentTokenLimit,
            searchDto.getSearchString());
        ...
    };
    ...
}

private Message getSystemMessage(List<Document> similarDocuments, int tokenLimit,
        String prompt) {
    String documentStr = this.cutStringToTokenLimit(
        similarDocuments.stream().map(entry -> entry.getContent())
            .filter(myStr -> myStr != null && !myStr.isBlank())
            .collect(Collectors.joining("\n")),
        tokenLimit);
    SystemPromptTemplate systemPromptTemplate = this.activeProfile.contains("ollama")
        ? new SystemPromptTemplate(this.ollamaPrompt)
        : new SystemPromptTemplate(this.systemPrompt);
    Message systemMessage = systemPromptTemplate.createMessage(
        Map.of("documents", documentStr, "prompt", prompt));
    return systemMessage;
}
The storeDocument(...) method now uses the embeddingTokenLimit of the properties file to limit the text chunk used to create the embedding. The queryDocuments(...) method now uses the documentTokenLimit of the properties file to limit the text chunk provided to the model for the generation. The systemPromptTemplate checks the activeProfile property for the ollama profile and creates the SystemPromptTemplate that includes the question. The createMessage(...) method creates the AI message and replaces the documents and prompt placeholders in the prompt string.

Conclusion
Spring AI works very well with Ollama. The model used in the Ollama Docker container was stable-beluga:13b. The only differences in the implementation were the changed dependencies and the missing user prompt for the Llama models, but that is a small fix. Spring AI enables very similar implementations for external AI services like OpenAI and local AI services like Ollama-based models, which decouples the Java code from the AI model interfaces very well. The performance of the Ollama models required a decrease of the document-token-limit from 2000 for OpenAI to 150 for Ollama without GPU acceleration, and the quality of the AI model answers decreased accordingly. To run an Ollama model with parameters that result in better quality and acceptable response times, a server with GPU acceleration is required. For commercial/production use, a model with an appropriate license is required. That is not the case for the beluga models; the falcon:40b model could be used instead.