Gowhere – Read CSV file

Continuing my Gowhere journey, I need to read data from a CSV file, a common file-handling task in programming.

Sample content with 3 columns: ID, Effort, and Title.

ID,Effort,Title
"94503","4","Bug: The line was not aligned correctly"
"97018","12","Implement a cool feature"
"97595","1","Document an incident"

The file is small, so I do not have to worry too much about performance at this point. Let’s see what is required to perform the task in Go.

In general, here are steps to read and parse a CSV file

  1. Open the file
  2. Read everything or line by line
  3. Parse the text into desired outcome format
  4. Close the file. When to close depends on how you read the file

Open a file

Go supplies the standard library package "os" to work with the file system. os.Open returns two values: a pointer to the file (*os.File) and an error. If the error is not nil, the file must not be used. Returning a value together with an error is a normal pattern in Go.

There are 3 required steps

  1. Open the file by os.Open(file)
  2. Check for error before usage
  3. Close the file by using a defer function. This is important; otherwise the file handle leaks, and on some systems the file may stay locked

    // Need the os package to open a file
    // f is a pointer to File object (*File)
    f, err := os.Open(file)
    // Check for error before usage
    if err != nil {
        writeLog(err)
        return nil
    }
    // Close file by a defer function
    defer func() {
        er := f.Close()
        if er != nil {
            writeLog(er)
        }
    }()

Read file

Depending on the file size and the type of application, developers can choose to read all the contents at once or line by line. I always prefer line by line.

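    // This is a cool concept. Given the pointer to the file opened by os.Open,
    // a Scanner is created; by default it scans line by line
    b := bufio.NewScanner(f)

    stats := make(map[string]int64)
    for b.Scan() {
        line := b.Text()
        // Naive split: it breaks if a quoted field itself contains a comma
        parts := strings.Split(line, ",")
        if len(parts) < 2 {
            // Not enough columns; skip to avoid an index out of range panic
            continue
        }
        effortText := strings.Trim(parts[1], "\"")
        if effortText == "" {
            continue
        }

        // Base 10, 64 bits. The header row fails here, is logged, and is skipped
        effort, err := strconv.ParseInt(effortText, 10, 64)
        if err != nil {
            writeLog(err)
            continue
        }
        id := strings.Trim(parts[0], "\"")
        stats[id] = effort
        fmt.Printf("%s - %d\n", id, effort)
    }
    // Scan also returns false on a read error, so check why the loop ended
    if err := b.Err(); err != nil {
        writeLog(err)
    }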

Go supplies the bufio package for buffered IO. This was the first time I had heard of the Scanner concept. It clicked immediately, a kind of "wow, that makes sense and it is cool".

After holding a pointer to a Scanner:

  1. Call Scan() to loop through the buffered data
  2. Call Text() to access the current line as a string. For the raw bytes, call Bytes() instead
  3. Parse the line to meet your logic

Process data

For common operations on a string, there is the strings package. For conversion from a string to various data types (int, float, …), there is the strconv package. Both are part of the standard library.
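
A minimal sketch of these two packages working on one quoted field. The helper names parseField and parseEffort are mine, made up for illustration; only strings.Trim, strings.TrimSpace, strconv.ParseInt, and strconv.ParseFloat come from the standard library.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// parseField removes surrounding spaces and double quotes from one field.
// Hypothetical helper, not part of the original code.
func parseField(raw string) string {
	return strings.Trim(strings.TrimSpace(raw), "\"")
}

// parseEffort converts a quoted field such as "12" into an int64.
// Base 10 and a 64-bit size are stated explicitly.
func parseEffort(raw string) (int64, error) {
	return strconv.ParseInt(parseField(raw), 10, 64)
}

func main() {
	effort, err := parseEffort(`"12"`)
	fmt.Println(effort, err)

	// strconv covers other types as well, for example float64.
	hours, err := strconv.ParseFloat("1.5", 64)
	fmt.Println(hours, err)

	// A non-numeric field, such as the header cell, yields an error.
	_, err = parseEffort("Effort")
	fmt.Println(err != nil)
}
```

Passing base 10 explicitly avoids surprises with base 0, where a prefix such as 0x changes how the number is read.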

Close the file

Handled by the deferred function, which is called when the enclosing function exits.

    // Close file by a defer function
    defer func() {
        er := f.Close()
        if er != nil {
            writeLog(er)
        }
    }()
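
The split on "," in the reading loop breaks as soon as a title contains a comma. As an alternative that the code above does not use, here is a sketch built on the standard encoding/csv package, which understands quoted fields. The function name parseEfforts and the sample text are assumptions for illustration; it takes a string to stay self-contained, while with a real file csv.NewReader(f) accepts the opened file directly.

```go
package main

import (
	"encoding/csv"
	"fmt"
	"io"
	"strconv"
	"strings"
)

// Hypothetical sample; the first title contains commas inside the quotes.
const sample = `ID,Effort,Title
"94503","4","Bug: The line, again, was not aligned correctly"
"97018","12","Implement a cool feature"
"97595","1","Document an incident"
`

// parseEfforts reads CSV text with an ID,Effort,Title header and returns
// a map from ID to effort. Rows with an empty or non-numeric effort are skipped.
func parseEfforts(text string) (map[string]int64, error) {
	r := csv.NewReader(strings.NewReader(text))
	stats := make(map[string]int64)

	// The first record is the header row; read it and move on.
	if _, err := r.Read(); err != nil {
		if err == io.EOF {
			return stats, nil // empty input
		}
		return nil, err
	}
	for {
		row, err := r.Read() // one record at a time
		if err == io.EOF {
			break
		}
		if err != nil {
			return nil, err
		}
		if len(row) < 2 {
			continue // not enough columns
		}
		effort, err := strconv.ParseInt(row[1], 10, 64)
		if err != nil {
			continue // empty or non-numeric effort
		}
		stats[row[0]] = effort
	}
	return stats, nil
}

func main() {
	stats, err := parseEfforts(sample)
	if err != nil {
		fmt.Println("Cannot parse:", err)
		return
	}
	fmt.Println(stats["94503"], stats["97018"], stats["97595"])
}
```

The reader strips the quotes itself, so no strings.Trim is needed, and it still reads one record at a time.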

Gowhere – array, hash table, regular expression, and foreach loop

Gowhere? Some steps further—append arrays, hash table, regular expression, and for each loop.

After playing around with HTTP, I got the JSON data from the API. I wanted to analyze the data—to display the total hours spent on each work item.

A typical record has this structure

type WorkItem struct {
    FromDateTime time.Time
    ToDateTime   time.Time
    Detail       string
}

And an actual work item looks like

{
    "FromDateTime" : "2019-11-29 02:05:00 +0000 UTC",
    "ToDateTime": "2019-11-29 03:19:00 +0000 UTC",
    "Detail" : "Work Item 12345: Implement a cool feature"
}

The "Work Item 12345" has many records. The detail fields are not identical, but they all contain the number 12345 as the work item ID. I want to display the total time spent on Work Item 12345. So the algorithm is pretty simple

  1. For each record, extract the work item id from the detail field
  2. Calculate the difference in hours between FromDateTime and ToDateTime
  3. Add the difference to the existing total for that ID. If there is no entry yet, create one with zero time spent first

Note: If I were writing in C#, I could finish the implementation quickly with LINQ support.

The expected result of a work item is below

{
    "WorkItemId" : "12345",
    "WorkItemName" : "Work Item 12345: Implement a cool feature",
    "TimeSpent" : time_spent_in_hours
}

Stats structure to hold the analysis result of a work item

// Stats ...
type Stats struct {
    WorkItemId   string
    WorkItemName string
    TimeSpent    float64
}

Let’s write some code and explore

package main

import (
    "encoding/json"
    "fmt"
    "net/http"
    "regexp"
    "time"
)

func main() {
    // Assuming that I have a list of IDs for a team.
    // getWorkItemRecords returns a slice of WorkItem for one member.
    var records = make([]WorkItem, 0)
    for _, id := range team {
        r := getWorkItemRecords(id)
        // Discussion about appending 2 slices: https://stackoverflow.com/questions/16248241/concatenate-two-slices-in-go

        // This is how to append one slice to another
        records = append(records, r...)
    }

    // A dictionary (hash table) with
    // key: WorkItemID (or name if cannot find the ID) - simply a string
    // value: total time spent
    // More detail about map here: https://blog.golang.org/go-maps-in-action
    statsMap := make(map[string]Stats)

    // Regular expression to extract the ID (the first run of digits)
    workItemIdExp := regexp.MustCompile(`\d+`)

    var id string

    for _, r := range records {
        timeSpent := r.ToDateTime.Sub(r.FromDateTime).Hours()
        if timeSpent < 0 {
            // The record does not have an end time
            continue
        }

        id = workItemIdExp.FindString(r.Detail)

        if id == "" {
            id = r.Detail
        }

        ts, exist := statsMap[id]

        if !exist {
            ts = Stats{id, r.Detail, 0}
        }
        ts.TimeSpent += timeSpent
        statsMap[id] = ts
    }

    var workingHours float64 = 0
    for key, value := range statsMap {
        workingHours += value.TimeSpent
        fmt.Printf("%s (%s) %f\n", key, value.WorkItemName, value.TimeSpent)
    }

    fmt.Printf("Working hours: %f\n", workingHours)
}

What are my further steps from this exercise?

  1. Append one slice to another with the "..." syntax: append(records, r...)
  2. Hash table (dictionary-like) with map: map[string]int means a dictionary whose key is a string and whose value is an integer
  3. Regular expression with the regexp package: regexp.MustCompile("[\\d]+")
  4. For-each loop with range: for _, r := range records
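
The four items above can be sketched together in one self-contained program that runs without the API. The helper sampleRecords and its data are made up for illustration, and summarize is my own name for the loop from the program above. Sorting the keys is an addition: map iteration order is not fixed in Go.

```go
package main

import (
	"fmt"
	"regexp"
	"sort"
	"time"
)

type WorkItem struct {
	FromDateTime time.Time
	ToDateTime   time.Time
	Detail       string
}

type Stats struct {
	WorkItemId   string
	WorkItemName string
	TimeSpent    float64
}

var workItemIdExp = regexp.MustCompile(`\d+`)

// sampleRecords builds hypothetical records for two team members
// and joins them with the ... syntax.
func sampleRecords() []WorkItem {
	start := time.Date(2019, 11, 29, 2, 0, 0, 0, time.UTC)
	first := []WorkItem{
		{start, start.Add(90 * time.Minute), "Work Item 12345: Implement a cool feature"},
	}
	second := []WorkItem{
		{start, start.Add(30 * time.Minute), "Review of 12345"},
		{start, start.Add(time.Hour), "Team meeting"},
	}
	return append(first, second...)
}

// summarize groups records by the first number found in Detail
// (or by Detail itself when there is none) and sums the hours per group.
func summarize(records []WorkItem) map[string]Stats {
	statsMap := make(map[string]Stats)
	for _, r := range records {
		hours := r.ToDateTime.Sub(r.FromDateTime).Hours()
		if hours < 0 {
			continue // no usable end time
		}
		id := workItemIdExp.FindString(r.Detail)
		if id == "" {
			id = r.Detail
		}
		ts, exist := statsMap[id]
		if !exist {
			ts = Stats{WorkItemId: id, WorkItemName: r.Detail}
		}
		ts.TimeSpent += hours
		statsMap[id] = ts
	}
	return statsMap
}

func main() {
	statsMap := summarize(sampleRecords())

	// Sort the keys so the output order is stable between runs.
	keys := make([]string, 0, len(statsMap))
	for k := range statsMap {
		keys = append(keys, k)
	}
	sort.Strings(keys)
	for _, k := range keys {
		fmt.Printf("%s (%s) %.2f\n", k, statsMap[k].WorkItemName, statsMap[k].TimeSpent)
	}
}
```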

A happy weekend!

Gowhere – http

I Go (went) a step further—http and other things to consume an API service. What would it take to call an API which returns a list of records in JSON? In .NET, it takes a few lines of code.

Scenario: Display a list of employees (Id, First Name, Last Name, and Joined Date) from a company's protected API. The returned value might contain more fields than necessary.

  1. Id: integer
  2. First Name and Last Name: string
  3. Joined Date: Date Time

First things first, create a file http.go and write some code. To work with http, Go supplies the net/http package.

package main

import (
    "fmt"
    "net/http"
)

func main() {
    fmt.Println("Connecting to the API ...")
    const url = "https://xxxcompany.com/api/employees"

    const accessToken = "base64 access token"

    fmt.Println("Base Address: ", url)
    fmt.Println("Access Token: ", accessToken)

    // Issue a default request but will not work because of the missing access token
    resp, err := http.Get(url)
    if err != nil {
        fmt.Println("Cannot connect the API: ", err)
        return
    }

    // Close the body at the end of the execution
    defer resp.Body.Close()
}

Nothing fancy! I took it from the documentation of the Go http package. When making an HTTP call, Go returns a response and an error; the error is not nil if there is a connection problem. Go encourages always checking the error before using the result, a good practice.

The above code will get a 401 status code (Unauthorized) back. I need to attach the access token to the request. To manipulate the request, I need to create it myself and ask Go to send it. It is quite easy.

    // Create a custom request with custom headers
    resq, err := http.NewRequest(http.MethodGet, url, nil)
    if err != nil {
        // Without this check, a failed NewRequest leaves resq nil
        fmt.Println("Cannot create the request: ", err)
        return
    }
    resq.Header.Add("x-access-token", accessToken)
    // Send the request using the default client supplied by the http
    resp, err := http.DefaultClient.Do(resq)

    if err != nil {
        fmt.Println("Cannot connect the API: ", err)
        return
    }
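
To see the 401 and the effect of the header without a real API, here is a sketch that runs a stand-in handler in-process with net/http/httptest. The handler protected, the token value "secret", and the helpers newRequest and statusFor are all assumptions for illustration; the real service will behave in its own way.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
)

// newRequest builds a GET request carrying the token in the
// x-access-token header. The error from http.NewRequest is checked
// before the request is used.
func newRequest(url, token string) (*http.Request, error) {
	req, err := http.NewRequest(http.MethodGet, url, nil)
	if err != nil {
		return nil, err
	}
	if token != "" {
		req.Header.Set("x-access-token", token)
	}
	return req, nil
}

// protected is a stand-in for the real API: it answers 401 unless
// the expected (hypothetical) token is present.
func protected(w http.ResponseWriter, r *http.Request) {
	if r.Header.Get("x-access-token") != "secret" {
		http.Error(w, "unauthorized", http.StatusUnauthorized)
		return
	}
	fmt.Fprint(w, "[]")
}

// statusFor calls the handler directly, with no network involved,
// and returns the status code it recorded.
func statusFor(token string) (int, error) {
	req, err := newRequest("https://xxxcompany.com/api/employees", token)
	if err != nil {
		return 0, err
	}
	rec := httptest.NewRecorder()
	protected(rec, req)
	return rec.Code, nil
}

func main() {
	for _, token := range []string{"", "secret"} {
		code, err := statusFor(token)
		fmt.Println(code, err)
	}
}
```

With a real response, the same check is a comparison of resp.StatusCode against http.StatusOK before decoding the body.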

What did I get from resp.Body? A stream of bytes.

body, err := ioutil.ReadAll(resp.Body)

But I need a list of employees, and the response is in JSON format. Go gives me the encoding/json package to decode JSON bytes into a Go value, here a struct. So I define an Employee struct, a custom data type, to hold the result. The Employee struct has JoinedDate, which is a date time; the time package is supplied to deal with time.

import (
    "encoding/json"
    "fmt"
    "net/http"
    "time"
)

// Existing code

// Employee
type Employee struct {
    Id         int
    FirstName  string
    LastName   string
    JoinedDate time.Time
}

And it’s time for gardening: decode the byte stream into a list of employees

    // Create a decoder reading from resp.Body (an io.Reader)
    decoder := json.NewDecoder(resp.Body)
    // Prepare an empty slice of employees
    employees := make([]Employee, 0)
    // Decode, pass the pointer to the employees, and check the error
    if err := decoder.Decode(&employees); err != nil {
        fmt.Println("Cannot decode the response: ", err)
        return
    }
    // Print the result
    fmt.Println(employees)
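
The decoding step can be tried on its own, without the API. This sketch reads from a string instead of resp.Body; the payload and the helper decodeEmployees are made up for illustration. It shows two details: fields without a match in the struct (Department here) are ignored by default, and time.Time expects the RFC 3339 format in JSON.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
	"time"
)

type Employee struct {
	Id         int
	FirstName  string
	LastName   string
	JoinedDate time.Time
}

// decodeEmployees decodes a JSON array into a slice of Employee
// and returns the decoding error instead of dropping it.
func decodeEmployees(payload string) ([]Employee, error) {
	var employees []Employee
	decoder := json.NewDecoder(strings.NewReader(payload))
	if err := decoder.Decode(&employees); err != nil {
		return nil, err
	}
	return employees, nil
}

func main() {
	// Hypothetical payload with one extra field.
	payload := `[{"Id": 1, "FirstName": "Ada", "LastName": "Lovelace",
		"JoinedDate": "2019-11-29T02:05:00Z", "Department": "R&D"}]`

	employees, err := decodeEmployees(payload)
	if err != nil {
		fmt.Println("Cannot decode:", err)
		return
	}
	fmt.Println(employees[0].FirstName, employees[0].JoinedDate.Year())

	// A date in another layout is rejected with an error.
	_, err = decodeEmployees(`[{"JoinedDate": "2019-11-29 02:05:00"}]`)
	fmt.Println(err != nil)
}
```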

Putting it all together, I have a working program. Run go run http.go and feel good.

package main

import (
    "encoding/json"
    "fmt"
    "net/http"
    "time"
)

func main() {
    fmt.Println("Connecting to the API ...")
    const url = "https://xxxcompany.com/api/employees"

    const accessToken = "base64 access token"

    fmt.Println("Base Address: ", url)
    fmt.Println("Access Token: ", accessToken)

    // Create a custom request with custom headers
    resq, err := http.NewRequest(http.MethodGet, url, nil)
    if err != nil {
        fmt.Println("Cannot create the request: ", err)
        return
    }
    resq.Header.Add("x-access-token", accessToken)
    // Send the request using the default client supplied by the http
    resp, err := http.DefaultClient.Do(resq)

    if err != nil {
        fmt.Println("Cannot connect the API: ", err)
        return
    }

    // Close the body at the end of the execution
    defer resp.Body.Close()

    // Create a decoder reading from resp.Body (an io.Reader)
    decoder := json.NewDecoder(resp.Body)
    // Prepare an empty slice of employees
    employees := make([]Employee, 0)
    // Decode, pass the pointer to the employees, and check the error
    if err := decoder.Decode(&employees); err != nil {
        fmt.Println("Cannot decode the response: ", err)
        return
    }
    // Print the result
    fmt.Println(employees)
}

// Employee
type Employee struct {
    Id         int
    FirstName  string
    LastName   string
    JoinedDate time.Time
}

Go where? One step further.

  1. 3 new packages: net/http, encoding/json, time
  2. Create custom http requests
  3. Define a new type via struct (similar to a class in C#)
  4. Decode from a byte stream of JSON data to a slice of structs
  5. Use the built-in make function to create a slice (it also creates maps and channels)

It makes my day, especially for the weekend!

Gowhere

I started some Go code. "Go where?", I wondered. For some mysterious reasons, I liked the term Gowhere. Let’s create a repository on GitHub to play with Go (Golang)—Gowhere.

Heading to the official Go site and following the instructions, I started to write code in no time. It is really fast and easy to get started. Let’s start with the famous "Hello World!" application.

Installing Go is easy and straightforward

  1. Download the installation package from the official website, choosing the one for your OS. I installed it on both macOS and Windows
  2. Run the package and follow the instructions: Next, Next, and Done
  3. Open the command line and run the command go version

I use VS Code, highly recommended, as my code editor. Once I created my first Go file and opened it with VS Code, it suggested all the plugins I need to be productive with Go. Nice!

So far, I have only used Go in a single file: write code in a file, no executable package. To run a Go file, I run go run filename.go in a terminal, whether the one inside VS Code, Command Prompt, PowerShell, or the macOS terminal.

Create hello.go and try it out

package main

import (
    "fmt"
)

func main() {
    fmt.Println("Hello World! Greeting from Go")
}

Neat and straightforward. There are 3 parts—package, import, and func.

package – Declares the package the file belongs to, similar to a namespace in C#: a higher logical grouping for functions, types, and methods. Every Go file requires a package declaration.

import – Imports other packages so that their exported names can be used.

func – Defines a function. In the above example, main is the entry point function.

What if I changed the names, the package name to main1 or the function name to main_v1? It failed. For an executable program, the combination of package main and func main is strictly required.

The opening brace ({) must be on the same line as the function name. This code will cause a compile error

func main() 
{
    fmt.Println("Hello World! Greeting from Go")
}

All those errors are caught at compile time because Go is a statically typed, compiled language. With help from VS Code, I do not have to remember or worry about them. I will remember them as I write more Go code.

Go has a set of data types, just like any other language. What is new to me are defer and pointers. I learned about pointers back in university. Since then I have not used them, thanks to C#.

Defer allows developers to schedule function calls to be executed in Last In First Out (LIFO) order once the surrounding function completes its execution. I have not yet understood what it is used for, maybe to clean up resources. But I think it is an important concept in Go. I will find out more when I step further.
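
A small sketch of the LIFO order. The function runOrder is made up for illustration: it records each step in a slice so the order can be inspected after the inner function has returned.

```go
package main

import "fmt"

// runOrder records the order in which the body and the deferred
// calls of the inner function run.
func runOrder() []string {
	var order []string
	func() {
		defer func() { order = append(order, "first deferred") }()
		defer func() { order = append(order, "second deferred") }()
		order = append(order, "body")
	}() // the deferred calls run here, when the inner function returns
	return order
}

func main() {
	for i, step := range runOrder() {
		fmt.Println(i+1, step)
	}
}
```

The body runs first, then the deferred call registered last, then the one registered first.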

"How do I pass the returned value from the host method to deferred methods?", I wondered. I figured one possible solution using pointer.

A pointer holds the address of a variable, which allows consumers to access the latest value of that variable: I know your address, I will go there and get it. Sometimes it is a dangerous thing, so use it with care.

Deferred functions are also functions. Therefore it is fine to use defer inside a deferred function, as in the example below.


func deferCallWorkLikeCallBack() {
    // Is it possible to pass the returned value from calling method to the deferred function?
    var number int = 1
    // Arguments are evaluated when the defer statement runs, so this passes 1 (a copy, not a reference)
    defer deferInDefer(number)
    // This will pass the latest number, which is 13, to the deferred function
    // Pass the address (pointer) of the number variable
    defer deferWithPointer(&number)
    fmt.Println("Hi I am a deferred call. You can think of me as a callback in your word")
    number = 13
}

func deferInDefer(number int) {
    fmt.Println("Hi you passed the number:", number)
}

func deferWithPointer(number *int) {
    // This allows the function to access the latest value passed from the calling function
    fmt.Println("Hi you passed the pointer to number:", *number)
}

If I pass a pointer to a deferred function, I should be able to access the latest value of a variable.

package main

import (
    "fmt"
)

func main() {
    defer deferCallWorkLikeCallBack()
    fmt.Println("Hello World! Greeting from Go")
}

Will produce this output

Hello World! Greeting from Go
Hi I am a deferred call. You can think of me as a callback in your word
Hi you passed the pointer to number: 13
Hi you passed the number: 1
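
Besides a pointer, there is one more way to reach the returned value from a deferred function, which the code above does not use: a named result combined with a deferred closure. A minimal sketch, where priceWithFee and its numbers are made up for illustration:

```go
package main

import "fmt"

// priceWithFee returns price plus a fee of 10. The deferred closure runs
// after the return statement has stored the value in the named result
// total, so it can read that value and even change it.
func priceWithFee(price int) (total int) {
	defer func() {
		if total > 100 {
			total = 100 // cap the value the caller receives
		}
	}()
	return price + 10
}

func main() {
	fmt.Println(priceWithFee(50))
	fmt.Println(priceWithFee(95))
}
```

The first call returns 60 unchanged; the second would be 105, and the deferred closure caps it at 100.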

That’s a good start. I might go somewhere.