Rust Essentials

My recall of the essentials of Rust. This is a learning document that follows the "recall and spaced repetition" approach.

Why

In C#, a language I have been coding in for years and, of course, love, a common error is NullReferenceException. It means your code holds a reference that points to "nothing": there is nothing at that location in memory. In other words, the reference is invalid.

Memory is a limited resource, and many programs compete for a share. Memory management is crucial to any programming language. Unused memory must be returned to the operating system. In C#, the Garbage Collector (GC) does that, at the cost of some overhead on your program. I am far from knowing the details here.

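As far as I know, Rust was designed with those problems in mind. Safe Rust has no null references, and the compiler tracks who owns each value and when its memory is released, so these mistakes are caught at compile time instead of in production.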
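To make the "no null" idea concrete, here is a minimal sketch of how Rust models a value that may be missing, using the standard Option type. The functions find_name and greet, and the ids they use, are made up for illustration.

```rust
/// Hypothetical lookup: only id 1 has a name.
fn find_name(id: u32) -> Option<String> {
    if id == 1 {
        Some(String::from("Thai Anh Duc"))
    } else {
        None // "nothing" is an explicit value, not a null reference
    }
}

/// The caller has to deal with both cases before it can use the name.
fn greet(id: u32) -> String {
    match find_name(id) {
        Some(name) => format!("Hello, {name}"),
        None => String::from("Hello, stranger"),
    }
}
```

Calling greet(1) gives "Hello, Thai Anh Duc" and greet(2) gives "Hello, stranger". The missing case is part of the type, so it cannot be forgotten silently.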

Heap and Stack

They are two regions of memory used to store data. The stack stores values whose size is known at compile time, such as numbers and other fixed-size values. The heap stores data whose size is unknown at compile time or may change at runtime; in C#, these are the reference types.

When your code declares a reference type, what the code holds is a pointer stored on the stack. The pointer knows where to look for the value on the heap. A NullReferenceException occurs when the pointer is on the stack but does not point to any value on the heap.

Heap and Stack are essential concepts for developers to understand programming languages.
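A small sketch of the same idea in Rust. The function name stack_and_heap is made up for illustration. An i32 lives entirely on the stack, while a String keeps a small fixed-size handle on the stack (typically a pointer, a length, and a capacity) and its text on the heap.

```rust
/// Returns (size of the i32, size of the String handle on the stack, bytes of text on the heap).
fn stack_and_heap() -> (usize, usize, usize) {
    let age: i32 = 38; // the whole value is on the stack
    let name = String::from("Thai Anh Duc"); // handle on the stack, text on the heap

    (
        std::mem::size_of_val(&age),  // 4 bytes for an i32
        std::mem::size_of_val(&name), // size of the handle only, independent of the text length
        name.len(),                   // number of bytes of text stored on the heap
    )
}
```

The second number stays the same no matter how long the name is; only the third one grows with the text.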

Mutable or Immutable by default

Another common, nasty problem is the "race condition": many code paths modify the same location (data). Imagine that you have some money in your pocket. Someone else "accesses" your pocket and takes some of it without your knowledge. Surprise! It is a source of unexpected errors, especially in production.

Rust is "immutable" by default. Code like this is valid in many programming languages, but the Rust compiler rejects it because name is not declared as mutable:

let name: String = String::from("Thai Anh Duc");

name.push_str("Oh No!");

Rust requires your "awareness". If you intend to modify something, say it explicitly with the mut keyword.

let mut name: String = String::from("Thai Anh Duc");

name.push_str("You Rock!");

Ownership

This concept was new to me. I had not thought of ownership at the programming language level. Until now, "ownership" only came up when I did domain design and data modeling.

A value always has one and only one owner at a time. A scope is denoted by curly braces {}. Once the scope ends, everything inside the scope is dropped unless its ownership has been moved out.

OK, so who is the owner, and what does it own? Let’s look at this code:

let name: String = String::from("Thai Anh Duc");

There is a variable, name, of type String that holds the value "Thai Anh Duc". In short, there are a "variable" and a "value".
"Variable" is the owner.
"Value" is owned by the variable (the owner).

let name: String = String::from("Thai Anh Duc");
// This works fine because name is the owner
println!("My name is {name}");

// Ownership is moved to it_is_mine
let it_is_mine: String = name;

// This code will not compile because what name owns was moved to it_is_mine.
// In other words, name no longer owns anything
println!("My name is {name}");
// This is ok
println!("My name is {it_is_mine}");

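Moving ownership applies to types such as String that own heap data (reference types, in C# terms). Simple fixed-size values are different: they implement the Copy trait, so the value is copied and each variable owns its own data. The code below works fine: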

// This is a string literal of type &str. The reference (pointer and length) is fixed-size and stored on the stack;
// the text itself lives in the program's read-only memory
let name = "Thai Anh Duc";
// This works fine because name is the owner
println!("My name is {name}");

// The reference is copied, so both name and copied_name stay valid
let copied_name = name;

// Both work fine
println!("My name is {name}");
println!("My name is {copied_name}");

Move

The transfer of ownership is a Move. The actual data is unchanged (the data on the heap).

Copy

Another copy of data on the stack is created for the new variable. Both variables operate on their own data. It is safe.
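A minimal sketch that puts Copy next to an explicit clone. The function name copy_and_clone is made up for illustration. For a type like String, which is moved by default, calling clone() is the standard way to ask for a second, independent copy of the heap data.

```rust
/// Returns both originals and both copies to show that all four are still usable.
fn copy_and_clone() -> (i32, i32, String, String) {
    let age: i32 = 38;
    let copied_age = age; // i32 implements Copy: age stays valid

    let name = String::from("Thai Anh Duc");
    let cloned_name = name.clone(); // explicit copy of the heap data: name stays valid

    (age, copied_age, name, cloned_name)
}
```

Without the call to clone(), the assignment would move the String and name could no longer be returned.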

Drop

When a scope ends, Rust performs a "Drop" to release the memory allocated in the scope. Notice that a value is cleaned up unless its ownership is moved to the consumer, for example by returning it from a function.

fn main(){
    let name: String = String::from("Thai Anh Duc");

    let hi = move_ownership(name);

    println!("{hi}");
}

fn move_ownership(name: String) -> String{
    let say_hi : String = String::from("Hi: ") + &name;

    say_hi
}
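To see when a drop actually happens, here is a sketch that records each drop in a shared log. The Noisy type and the drop_order function are made up for illustration; Drop, Rc, and RefCell come from the standard library.

```rust
use std::cell::RefCell;
use std::rc::Rc;

/// A value that writes to a shared log when it is dropped.
struct Noisy {
    label: &'static str,
    log: Rc<RefCell<Vec<String>>>,
}

impl Drop for Noisy {
    fn drop(&mut self) {
        self.log.borrow_mut().push(format!("drop {}", self.label));
    }
}

/// Returns the events in the order they happened.
fn drop_order() -> Vec<String> {
    let log = Rc::new(RefCell::new(Vec::new()));
    let outer = Noisy { label: "outer", log: Rc::clone(&log) };

    {
        let _inner = Noisy { label: "inner", log: Rc::clone(&log) };
        log.borrow_mut().push(String::from("end of inner scope"));
    } // _inner is dropped here, when its scope ends

    let moved = outer; // a move transfers ownership and drops nothing
    log.borrow_mut().push(String::from("after move"));
    drop(moved); // the new owner decides when the value goes away

    let events = log.borrow().clone();
    events
}
```

The log comes back as "end of inner scope", "drop inner", "after move", "drop outer": the inner value is released at the closing brace, and the moved value is released only when its new owner lets go of it.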

Reference

In the previous example, &name creates a reference to name. A reference borrows the value: it lets the code read the String, whose data is on the heap, without taking ownership of it.
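A minimal sketch of borrowing through a function parameter. The names say_hi and borrow_then_use are made up for illustration. Because say_hi only receives a reference, the caller still owns name after the call.

```rust
/// Borrows the name; the caller keeps ownership.
fn say_hi(name: &str) -> String {
    format!("Hi: {name}")
}

/// Returns the greeting together with the original, still valid, name.
fn borrow_then_use() -> (String, String) {
    let name = String::from("Thai Anh Duc");
    let hi = say_hi(&name); // lend name to say_hi for the duration of the call
    (hi, name)              // name was never moved, so it can still be used here
}
```

Compare this with move_ownership above, which takes a String by value: after that call, the caller can no longer use name.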

Bottom lines

I recalled and documented what I have learned from Rust Understanding Ownership.

Rust Data Types

The official reference is here. It is a wonderful and well-written resource. As a learner, I want to write things down in my own words or repeat what is written there.

Rust is a statically typed language. Just like Java, C#, and others, it must know the types of all variables at compile time. The compiler gets that information in two ways. One is that types are supplied explicitly. The other is inference: the compiler deduces the type from the value and from how it is used.

I usually prefer the explicit type approach, especially for numeric values.

// Explicit declaration
let age: i32 = 38;

// Implicit declaration. The compiler will figure out the type
let age = 38;

A few things from that simple statement

  • let: Rust keyword to define a variable
  • age: variable name
  • :: separator between variable name and type
  • i32: variable type, in this case, a 32-bit integer
  • =: assignment, assign a value to the variable
  • 38: variable value
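Sometimes the value alone is not enough for inference, and the annotation is what settles the type. A minimal sketch, where parse_age is a made-up helper: the standard parse() method can produce many numeric types, so the annotation on age tells the compiler which one is wanted.

```rust
use std::num::ParseIntError;

/// Converts text such as " 38 " into an i32, or returns the parse error.
fn parse_age(text: &str) -> Result<i32, ParseIntError> {
    // Without ": i32" the compiler would have to pick the target type from the return type alone;
    // the explicit annotation makes the intent visible at the declaration
    let age: i32 = text.trim().parse()?;
    Ok(age)
}
```

parse_age(" 38 ") returns Ok(38), and parse_age("abc") returns an error instead of a guessed value.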

Shadow and Mutate

Shadowing is the ability to declare a new variable with the same name as an existing one. Because it is a new variable, it can even have a different data type.
Rust is immutable by default. To mutate a variable, it must be declared with the mut keyword. It requires extra attention from developers: you should know what you are trying to do.

fn shadow() {
    let x = 5;
    let x = x + 1; // which is 6

    {
        // Create an inner scope
        // shadow the x variable
        let x = x * 2;
        println!("The value of x in the inner scope is: {x}");
    }

    println!("The value of x in the outer scope is: {x}");
}

fn mutate_scope() {
    let mut x = 5;
    x = x + 1; // which is 6

    {
        // Create an inner scope
        // mutate the x variable from the outer scope (no shadowing here)
        x = x * 2;
        println!("The value of x in the inner scope is: {x}");
    }

    println!("The value of x in the outer scope is: {x}");
}
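Shadowing can also change the type, which mutation cannot do. A minimal sketch, with a made-up function name:

```rust
/// Shadows a &str with a usize of the same name and returns the number.
fn shadow_changes_type() -> usize {
    let spaces = "   ";        // a &str holding three spaces
    let spaces = spaces.len(); // a new variable named spaces, now a usize
    // With "let mut spaces" and "spaces = spaces.len();" the compiler would report mismatched types
    spaces
}
```

The function returns 3, the length of the original text.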

Tuple and Array

A tuple allows developers to construct a new structure (or data type) that holds values of different data types.
An array is a fixed-size structure whose elements all have the same data type.

fn data_types() {
    // Besides the normal types seen in many other languages, tuple and array are interesting to explore here
    let tup: (i32, &str) = (38, "Thai Anh Duc");
    println!("Hi, my name is {}, {} years old", tup.1, tup.0);

    // Destructure tup into individual variables
    let (mut age, mut name) = tup;
    println!("Hi, my name is {name}, {age} years old");

    age = 10;
    println!("Hi, my name is {name}, {age} years old");

    name = "TAD";
    println!("Hi, my name is {name}, {age} years old");

    // Array is fixed size
    let _months = ["January", "February", "March", "April"];
    // Repeat one value: three elements, each set to 10
    let auto_values = [10;3];
    println!("{}", auto_values[2]);
}
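A short sketch of a few more things an array offers. The function name array_basics is made up for illustration; len, iter, sum, and get are standard methods.

```rust
/// Returns (number of elements, sum of the elements, element at index 5 if it exists).
fn array_basics() -> (usize, i32, Option<i32>) {
    let values = [10; 3];                      // [10, 10, 10]
    let total: i32 = values.iter().sum();      // add up all elements
    let out_of_range = values.get(5).copied(); // get() returns None instead of panicking
    (values.len(), total, out_of_range)
}
```

The result is (3, 30, None): the length is fixed at three, and asking for an index beyond it with get() gives None rather than a crash.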

With those basic data types, one can write an application.

Rust on Linux

Entering a new discomfort zone with Linux and Rust. This post documents the process and will serve as a reference for later use.

Set up a Linux box

Having another machine with a Linux operating system is cool but not necessary. Welcome to the Windows Subsystem for Linux 2 (WSL 2).
I am on Windows 11 and follow the instructions from Microsoft Docs.
With the Terminal and PowerShell installed, run the command

wsl

The terminal displays the help with possible commands. I want to understand them a bit before actually executing any command.

wsl --list --online

It displays all the available distributions. Ubuntu is the default if none is specified. Let’s use the default. Remember to run the terminal as Administrator.

# Explicitly specify the distribution for clarity
wsl --install --distribution Ubuntu

Nice! Installed and rebooted.

Create a folder (dir) to store code

$ mkdir code

Visual Studio Code with WSL

I am following the document here.

  • Install Remote Development extension
  • Navigate to the Ubuntu terminal and type code . (including the trailing dot). The magic begins

Rust on Linux

The Rust documentation is rich. I follow the installation steps from its book, The Rust Programming Language.

$ curl --proto '=https' --tlsv1.2 https://sh.rustup.rs -sSf | sh

After successfully installing Rust on the Linux box, I need to restart the shell so that it picks up the new environment variables. Otherwise, it does not recognize the newly installed commands.

# This is not recognized as a command yet
$ cargo
# Reload the profile (bash shell)
$ source ~/.profile
# This will work
$ cargo

Write some code and struggle

fn main() {
    println!("Hello Rust 101");
}

I got the first error: "linker ‘cc’ not found".
OK. Let’s get the linker.

$ sudo apt install build-essential

Enjoy the fruit

# Compile the code
$ rustc ./main.rs

# Run it
$ ./main

Summary

So what have I accomplished so far? I have set up a new development environment which includes

  • A Linux box running on Windows 11 using WSL 2
  • Visual Studio Code remote development. It allows me to stay on the Windows and write code in the Linux box. It is neat and straightforward. VS Code is amazing
  • Rust installed and a "hello world" Rust application written, compiled, and run

What a great way to start a weekend!

Production Bug – Unnoticeable Async Void

The operations team reported a bug in production. The server crashed a few times, seemingly daily. Here are the details. I replaced some text with XXX to remove confidential data.

Your app crashed because of System.Net.Http.HttpRequestException
Your app, XXX, crashed because of System.Net.Http.HttpRequestException and aborted the requests it was processing when the overflow occurred. As a result, your app’s users may have experienced HTTP 502 errors.

This call stack caused the exception:
System.Net.Http.HttpContent+d__54.MoveNext
System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw
System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess
System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification
System.Net.Http.HttpContent+d__70`2[[System.__Canon System.Private.CoreLib][System.__Canon System.Private.CoreLib]].MoveNext
System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw
System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess
System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification
XXX.DeleteSampleCommandHandler+<>c__DisplayClass6_0+<b__0>d.MoveNext
System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw
System.Threading.Tasks.Task+<>c.b__139_1
System.Threading.QueueUserWorkItemCallback+<>c.<.cctor>b__6_0
System.Threading.ExecutionContext.RunForThreadPoolUnsafe[[System.__Canon System.Private.CoreLib]]
System.Threading.QueueUserWorkItemCallback.Execute
System.Threading.ThreadPoolWorkQueue.Dispatch
System.Threading._ThreadPoolWaitCallback.PerformWaitCallback
---> (Inner Exception #0) System.IO.IOException
System.Net.Security.SslStream+d__214`1[[System.Net.Security.SslStream+SslReadAsync System.Net.Security]].MoveNext
System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw
System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess
System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification
System.Net.Http.HttpConnection+d__87.MoveNext
System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw
System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess
System.Net.Http.HttpConnection+ChunkedEncodingReadStream+d__10.MoveNext
System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw
System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess
System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification
System.Net.Http.HttpConnectionResponseContent+d__5.MoveNext
System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw
System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess
System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification
System.Net.Http.HttpContent+d__54.MoveNext
--- End of inner exception stack trace ---
---> (Inner Exception #1) System.ObjectDisposedException
System.Net.Security.SslStream.ThrowIfExceptional
System.Net.Security.SslStream.CheckThrow
System.Net.Security.SslStream+d__214`1[[System.Net.Security.SslStream+SslReadAsync System.Net.Security]].MoveNext
--- End of inner exception stack trace ---

To learn more about this exception, use the Kudu console to view the log file in d:\home\logfiles\crashdumps, which contains up to 10 log files for your app if space is available.

A long stack trace with only one line related to our application:

XXX.DeleteSampleCommandHandler+<>c__DisplayClass6_0+<b__0>d.MoveNext 

Let’s check out that command handler class.

public async Task Handle(DeleteSampleCommand command, CancellationToken cancellationToken)
{
    var client = await _clientFactory.GetSampleApiClient(command.Endpoint);

    client.RequestProcessed += async (sender, response) =>
    {
        var request = await response.RequestMessage.Content.ReadAsStringAsync();
        var content = await response.Content.ReadAsStringAsync();

        await _auditRepository.Add(command.Endpoint, command.SampleId, _context.User, "DeleteSample", request, (int)response.StatusCode, content);
    };

    await client.DeleteSampleAsync(command.SampleId, cancellationToken);

    // Other logic after deleting the sample from a remote endpoint

}

A client is auto-generated by Swagger/NSwag. This is the DeleteSampleAsync method.

public async System.Threading.Tasks.Task DeleteSampleAsync(System.Guid sampleId, System.Threading.CancellationToken cancellationToken)
{
    if (sampleId == null)
        throw new System.ArgumentNullException("sampleId");

    var urlBuilder_ = new System.Text.StringBuilder();
    urlBuilder_.Append(BaseUrl != null ? BaseUrl.TrimEnd('/') : "").Append("/samples/{sampleId}");
    urlBuilder_.Replace("{sampleId}", System.Uri.EscapeDataString(ConvertToString(sampleId, System.Globalization.CultureInfo.InvariantCulture)));

    var client_ = _httpClient;
    var disposeClient_ = false;
    try
    {
        using (var request_ = new System.Net.Http.HttpRequestMessage())
        {
            request_.Method = new System.Net.Http.HttpMethod("DELETE");

            PrepareRequest(client_, request_, urlBuilder_);
            var url_ = urlBuilder_.ToString();
            request_.RequestUri = new System.Uri(url_, System.UriKind.RelativeOrAbsolute);
            PrepareRequest(client_, request_, url_);

            var response_ = await client_.SendAsync(request_, System.Net.Http.HttpCompletionOption.ResponseHeadersRead, cancellationToken).ConfigureAwait(false);
            var disposeResponse_ = true;
            try
            {
                var headers_ = System.Linq.Enumerable.ToDictionary(response_.Headers, h_ => h_.Key, h_ => h_.Value);
                if (response_.Content != null && response_.Content.Headers != null)
                {
                    foreach (var item_ in response_.Content.Headers)
                        headers_[item_.Key] = item_.Value;
                }

                ProcessResponse(client_, response_);

                var status_ = (int)response_.StatusCode;
                if (status_ == 200)
                {
                    return;
                }
                else
                if (status_ == 400)
                {
                    string responseText_ = ( response_.Content == null ) ? string.Empty : await response_.Content.ReadAsStringAsync().ConfigureAwait(false);
                    throw new ApiException("Bad Request", status_, responseText_, headers_, null);
                }
                else
                if (status_ == 401)
                {
                    string responseText_ = ( response_.Content == null ) ? string.Empty : await response_.Content.ReadAsStringAsync().ConfigureAwait(false);
                    throw new ApiException("Unauthorized", status_, responseText_, headers_, null);
                }
                else
                if (status_ == 403)
                {
                    string responseText_ = ( response_.Content == null ) ? string.Empty : await response_.Content.ReadAsStringAsync().ConfigureAwait(false);
                    throw new ApiException("Forbidden", status_, responseText_, headers_, null);
                }
                else
                if (status_ == 500)
                {
                    string responseText_ = ( response_.Content == null ) ? string.Empty : await response_.Content.ReadAsStringAsync().ConfigureAwait(false);
                    throw new ApiException("Server Error", status_, responseText_, headers_, null);
                }
                else
                {
                    var responseData_ = response_.Content == null ? null : await response_.Content.ReadAsStringAsync().ConfigureAwait(false); 
                    throw new ApiException("The HTTP status code of the response was not expected (" + status_ + ").", status_, responseData_, headers_, null);
                }
            }
            finally
            {
                if (disposeResponse_)
                    response_.Dispose();
            }
        }
    }
    finally
    {
        if (disposeClient_)
            client_.Dispose();
    }
}

The client.RequestProcessed is our custom event. The implementation looks like this:

// Extend the generated SampleClient, thanks to the partial class
public partial class SampleClient
{
    // The generated SampleClient declares this partial method so that consumers can hook up additional functionality
    partial void ProcessResponse(HttpClient client, HttpResponseMessage response)
    {
        RequestProcessed?.Invoke(client, response);
    }

    // Our custom event which is invoked whenever the ProcessResponse is executed
    public event EventHandler<HttpResponseMessage> RequestProcessed;
}

Investigation

The application is a Web API service. Command handlers are invoked from Controllers. Usually, an exception thrown while handling a request fails that request but cannot crash the process.

Unless?

I recalled reading the async/await guidance from David Fowler:

BAD Async void methods can’t be tracked and therefore unhandled exceptions can result in application crashes.

Async void usually appears in event handlers, which is exactly what this code is:

client.RequestProcessed += async (sender, response) =>
{
    var request = await response.RequestMessage.Content.ReadAsStringAsync();
    var content = await response.Content.ReadAsStringAsync();

    await _auditRepository.Add(command.Endpoint, command.SampleId, _context.User, "DeleteSample", request, (int)response.StatusCode, content);
};

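The compiler generates an async void delegate for that event handler, because EventHandler<HttpResponseMessage> returns void. Remember that ProcessResponse is synchronous: it invokes the handler and gets control back as soon as the handler reaches its first incomplete await, without waiting for the rest. Therefore, when an exception is thrown later in the event handler block, there is no Task to capture it and no caller to catch it, and it crashes the application.

But why is there an exception at all, and how does it explain the stack trace? While the event handler was still reading the request and response contents for auditing purposes, on a different thread, DeleteSampleAsync carried on and disposed of the response instance in its finally block. Reading from the disposed response produced the ObjectDisposedException seen in the stack trace.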

Solution

A quick and dirty solution introduces another method that returns a task.


private async Task AuditRequest(DeleteSampleCommand command, HttpResponseMessage response)
{
    var request = await response.RequestMessage.Content.ReadAsStringAsync();
    var content = await response.Content.ReadAsStringAsync();

    await _auditRepository.Add(command.Endpoint, command.SampleId, _context.User, "DeleteSample", request, (int)response.StatusCode, content);
}

client.RequestProcessed += (sender, response) =>
{
    _ = AuditRequest(command, response);
};

However, it breaks the audit requirements. _auditRepository.Add does not run when reading the response throws, and the exception is now silently swallowed by the discarded task.

To audit, the code needs to know the request and response. The execution flow can get them without reading HttpResponseMessage directly again: the generated Swagger client has already read the content and wrapped it in the ApiException.Response property.

And the final solution is as below.

public async Task Handle(DeleteSampleCommand command, CancellationToken cancellationToken)
{
    var payload = JsonConvert.SerializeObject(command);
    var response = "OK";
    var statusCode = 200;
    var client = await _clientFactory.GetSampleApiClient(command.Endpoint);

    try
    {
        await client.DeleteSampleAsync(command.SampleId, cancellationToken);
    }
    catch (ApiException apiException)
    {
        response = apiException.Response ?? apiException.Message;
        statusCode = apiException.StatusCode;
    }

    await _auditRepository.Add(
        command.Endpoint,
        command.SampleId,
        _context.User,
        "DeleteSample",
        payload, 
        statusCode,
        response);
    // Other logic after deleting the sample from a remote endpoint

}

Conclusion

Here are a few things I learned from this production issue.

  1. Be careful with async await. It looks cool but be sure you know what you are doing.
  2. Avoid accessing the Request/Response directly. Look for another solution. There is always another solution.
  3. When introducing an event, ask this question: "Do I need it? Can I write code without it?"

Production bugs are always interesting.

Production Bug – SQL Identity Jump

I got a bug report that some records were not synced between the two systems. There is a system to input data, and there is another system to expose data to external systems. In between, there is a system to convert data. Together they form a data pipeline, a typical architecture these days.

After checking data in related databases, I found that the error occurred in a single database where the data moved from one table to another. There is a source table, a destination table, and an intermediate table. We store the progress in a state which allows the system to continue from the last run.

Whenever there are changes in the source table, a SQL trigger executes and moves the "EntityId" column into the intermediate table.
Whenever the function (a timer Azure Function) starts, it reads data from the intermediate table and performs the required logic to convert data.

Everything worked as expected until the production bug was reported.

Schema and Code

The schema of the intermediate table, Entity_Changes:

CREATE TABLE [dbo].[Entity_Changes]
(
    [Ordering] [bigint] IDENTITY(1,1) NOT NULL PRIMARY KEY, 
    [EntityId] UNIQUEIDENTIFIER NOT NULL, 
)

The execution flow is

  1. Read records from Entity_Changes table
  2. Process them
  3. Delete them from Entity_Changes table

The simplified version of the code

private async Task<long> ReplaceRows(IDataSource dataSource, ConversionState resultState, CancellationToken cancellationToken)
{
    var pagingFrom = resultState.NumberOfRowsConverted;

    long recordsCount = 0;
    const int batchSize = 1000;
    do
    {
        cancellationToken.ThrowIfCancellationRequested();

        pagingFrom = resultState.NumberOfRowsConverted;
        var pagingTo = pagingFrom + batchSize;

        var pagedResult = await dataSource.ReadData(pagingFrom, pagingTo);
        if (pagedResult.MaxOrdering == 0)
        {
            _logger.LogInformation("No records to proceed");
            break;
        }

        // Do the conversion logic here

        recordsCount += distinctResults.Count;
        _logger.LogInformation($"{recordsCount} records");

        // NumberOfRowsConverted is used to set the PagingFrom value. 
        // It is the max ordering from the last batch
        resultState.NumberOfRowsConverted = pagedResult.MaxOrdering;
        resultState.State = (int)ConversionStateStatus.Running;
        await _conversionStateService.SaveLastSyncedState(resultState);

        var deletedRows = await dataSource.DeleteProceededRecords(pagingFrom, pagingTo);
        _logger.LogInformation($"Delete {deletedRows} rows from temp table");

    } while (true);

    return recordsCount;
}

In one of the data sources, I found these interesting numbers. The resultState.NumberOfRowsConverted was 5.885.592. The lowest Ordering in the intermediate table was 5.895.593. The difference was 10.001. It did not make sense. How could that happen?

With those numbers, ReadData returned an empty result because pagingTo was 5.886.592 (= 5.885.592 + 1.000), which is smaller than 5.895.593. Every run stopped at "No records to proceed", so the remaining rows were never converted.

From experience, I know that the SQL Identity column does not guarantee continuous values; for example, when a transaction fails, Identity values are lost. But a 10K gap is too big.

Looking at the code for hours did not help me explain it (by the way, I found some other issues with the code), so I asked Google about "SQL Identity Jump". Bingo! I found the answer in SQLHack Avoid Identity Jump:


SQL Server is using a different cache size for the various data type of identity columns. The cache size of the INT data type column is 1000, and the Cache size of the BIGINT or Numeric data typed column is 10000.

Bingo! In some circumstances, there are jumps. The Ordering column is a BIGINT, so the jump size is 10.000.

Solution

Once you know the root cause, the solution is quite simple. There are two problems to fix: the identity jump itself, and a paging implementation that was not accurate. Start with the paging. The set of rows might not be the same between reading and deleting data. How could that happen?

Let’s say, when reading data for the range [0-1.000], there are 10 rows in the database. The system processes them. By the time it deletes that range, there are 900 rows, because new changes arrived in the meantime, and rows that were never processed get deleted. The fix is to delete only what was actually processed.

// Delete only the ordering range that was actually processed
var deletedRows = await dataSource.DeleteProceededRecords(pagingFrom, pagedResult.MaxOrdering);

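To handle the identity jump itself, query using SELECT TOP (@batchSize) ... WHERE Ordering > @pagingFrom ORDER BY Ordering instead of querying by range. A gap of any size is then skipped, because the query returns the next rows wherever their Ordering values start.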

var pagedResult = await dataSource.ReadData(pagingFrom, batchSize);
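The difference between the two ways of reading can be simulated in a few lines. This is only a sketch in Rust, not the production code: the function names are made up, a slice of numbers stands in for the Ordering column, and the exact range bounds (exclusive start, inclusive end) are an assumption.

```rust
/// Range paging: rows with from < ordering <= from + batch_size.
/// The exact bounds are an assumption about the original query.
fn read_by_range(orderings: &[u64], from: u64, batch_size: u64) -> Vec<u64> {
    orderings
        .iter()
        .copied()
        .filter(|o| *o > from && *o <= from + batch_size)
        .collect()
}

/// "SELECT TOP" paging: the first batch_size rows with ordering > from.
/// Assumes the slice is sorted ascending, like ORDER BY Ordering.
fn read_top(orderings: &[u64], from: u64, batch_size: usize) -> Vec<u64> {
    orderings
        .iter()
        .copied()
        .filter(|o| *o > from)
        .take(batch_size)
        .collect()
}

/// Replays the numbers from the incident and returns how many rows each approach reads.
fn compare_paging() -> (usize, usize) {
    // The identity jumped: the next rows start more than 10.000 above the last converted one
    let orderings = [5_895_593_u64, 5_895_594, 5_895_595];
    let last_converted = 5_885_592_u64;

    let by_range = read_by_range(&orderings, last_converted, 1_000);
    let by_top = read_top(&orderings, last_converted, 1_000);
    (by_range.len(), by_top.len())
}
```

With the numbers from the incident, the range read finds 0 rows while the "top" read finds all 3, which is why the second approach survives a jump of any size.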

Production bugs are always interesting.