Go 101 - HackerSpace PESUECC's Golang Workshop!
Written By Mentors, HackerSpace PESUECCNote : Setup and Pre-requisites are a must, the mentors will NOT be helping you install go during workshop.
Pre-requisites | Setup
For linux :
sudo rm -rf /usr/local/go && tar -C /usr/local -xzf go1.19.1.linux-amd64.tar.gz
export PATH=$PATH:/usr/local/go/bin
go version
If after the 3 commands above, if you get the 1.19.1 version as output, you are good to go!
If you are on Windows, download the Go Installer and run it.
If you are on macOS, download the Go Installer and run it.
Prelude: What's Programming?
Hint: It's more than writing code. Let's go through a few bits I've shamelessly copy-pasted from "The Structure and Interpretation of Computer Programs" -
An assault on large problems employs a succession of programs, most of which spring into existence en route.
To appreciate programming as an intellectual activity in its own right you must turn to computer programming; you must read and write computer programs— many of them. It doesn’t matter much what the programs are about or what applications they serve. What does matter is how well they perform and how smoothly they fit with other programs in the creation of still greater programs. We are about to study the idea of a computational process. Computational processes are abstract beings that inhabit computers. As they evolve, processes manipulate other abstract things called data. The evolution of a process is directed by a pattern of rules called a program. People create programs to direct processes. In effect, we conjure the spirits of the computer with our spells.
That's it. A computer is basically magic that runs what are known as programs - basically, spells - that we use to manipulate the electrons to do our bidding.
This happens through "programs", which are nothing but text in a specific format - to make it easier for the computer to understand - and it's through these "Spells" that we can use the processor to manipulate the electrons to do our bidding.
Programs to do the same thing can look as simple as this -
print("hello, world!")
or as complicated as this -
class HelloWorld {
public static void main(String[] args) {
System.out.println("Hello, World!");
}
}
Both of these programs just write "Hello, world!" to the display!
And here's "Hello, world" in Golang -
package main
import "fmt"
func main() {
fmt.Println("Hello, world!")
}
What is Go?
Go is a programming language created at Google in 2009. It was created to address a crisis - it wasn't easy enough to write applications that were fast and ran at scale.
The existing solutions to this problem - Java and C/C++ - are all languages that are relatively difficult to learn and write maintainable code in. These languages tend to be very verbose, and they're more often than not difficult to understand at a glance.
Legacy codebases, or old codebases written in these languages, tend to be extremely difficult to maintain.
In short, we can sum up the issues with these languages as follows -
- Lack of readability
- Lack of static typing, and
- Lack of easy high-performance networking and parallelism (taking advantage of all the power your computer has by doing more things at once).
The folks at Google, including Ken Thompson (one of the original contributors to the C Programming Language), decided that the best way to address this issue was not to monkey-patch an existing language, but instead to write a whole new language from scratch.
Arguably, this wasn't a minor undertaking, but it's a decision that paid dividends multiple times over, as Go is arguably the standard for any modern-day backend application that needs to run at scale, and with low latency.
Why Golang?
The reason people and programmers love Go, apart from the cute mascot (We love the Gopher), is that it's extremely easy compared to other languages that offer the features that Go offers.
The other reason, perhaps the more important one, is that Go makes it super easy to do things in Parallel. Perhaps it won't be apparent as to how easy it is right now, but doing things in Parallel is an endeavor that involves coordinating and working with the Hardware as well as the Operating System.
Go's concurrency model, or how it recommends you do things in parallel, makes it extremely easy for first-timers and experienced developers alike to write code that runs in parallel with ease.
This is because Golang makes use of these nifty things called goroutines
.
To put it in context, think about it this way - You have a cake that you need to get rid of. Now, you can either work through the entire cake on your own, or you can call a few friends, share the cake, and get to work in parallel on the same cake!
The latter approach allows you to process the entire job in a lot less time.
What do you think will happen when we get to a very large number of friends? Do you think it'll be over in zero time?
Goroutines are your friends. They're lightweight abstractions over the fundamental unit of concurrency in your computer - Threads - that allow you to do things in Parallel a lot, lot, lot easier. Navin and Mukund will explain the fundamentals of Goroutines, and how to use them, a little later!
Similarity to C
Go is pretty similar to C, save for a few notable exceptions.
Here's An Example of the code for the popular "FizzBuzz" Program in Go -
func fizzbuzz_stdout_sequential(start, end int) {
for i := start; i <= end; i++ {
if i%3 == 0 {
fmt.Print("fizz")
}
if i%5 == 0 {
fmt.Print("buzz")
}
if i%3 != 0 && i%5 != 0 {
fmt.Print(strconv.Itoa(i))
}
fmt.Println("")
}
}
And here it is in C -
int fizzbuzz(int start, int end) {
for (int i = 0; i <= end; i++) {
if (i %3 == 0) {
printf("fizz");
}
if (i %5 == 0) {
printf("buzz");
}
if (i % 3 != 0 && i % 5 != 0) {
printf("%d", i);
}
printf("\n");
}
}
Here, you can see how similar Go and C are.
The basic syntax
Let's go back to the Hello World program that we have in Golang. Get out your laptops, bust open your IDE, and let's get programming!
package main
import "fmt"
func main() {
// variable declaration with var
var greet string
greet = "Hello, world!"
// variable declaration with assignment
greet_1 := "Hello, world! 1"
// what is the type of `answer`?
answer := 1 + 2
// show both
fmt.Println(greet)
fmt.Println(greet_1)
fmt.Println(answer)
for i := 5; i > -1; i-- {
fmt.Println("Counting Down ..", i)
}
}
A few things to note -
- Much like C, every variable in Go has a
type
, which tells the computer how to store the variable internally. You can also use types to make sure your program is correct. - Using the
:=
allows the compiler to "infer", or derive, the type of data that you're passing, so you don't need to explicitly specify it. - Packages are bits of code written by other people so you can reuse them.
fmt
is a package that comes built in with Golang, much like many others.PrintLn
is a function is thefmt
package.
Understanding What go can do with FizzBuzz
Here's the same program written above, slightly modified to write to an array instead of the screen.
func fizzbuzz_sequential(start, end int) []string {
// declare the array we're using to keep the strings in
var fb []string
for i := start; i <= end; i++ {
// keep a variable to check if the number is divisible by 3 or 5
divisible := 0
// no else used here to ensure numbers divisbile by 3 and 5 show up as "fizzbuzz"
if i%3 == 0 {
fb = append(fb, "fizz")
divisible = 1
}
if i%5 == 0 {
fb = append(fb, "buzz")
divisible = 1
}
// if we haven't appended anything, push the number itself
if divisible == 0 {
fb = append(fb, strconv.Itoa(i))
}
fb = append(fb, "\n")
}
return fb
}
But we can do better. After adding parallelization, there's a noticeable speedup. You can find this code here, at https://github.com/anirudhRowjee/fizzbench
fizzbuzz-go on main [!] via 🐹 v1.18.1 took 17s
λ go run main.go
2022/10/07 10:21:10 FIZZBUZZ BENCHMARK REPORT
2022/10/07 10:21:10 GOROUTINES: 100 | START: 0 | END: 10000000
2022/10/07 10:21:10 SEQUENTIAL Execution took (AVG over 5 runs) 1.887160 sec
2022/10/07 10:21:10 PARALLEL Execution took (AVG over 5 runs) 0.853446 sec
2022/10/07 10:21:10 WORKER POOL Execution took (AVG over 5 runs) 0.721665 sec
2022/10/07 10:21:10 Average Speedup (Parallel vs Sequential) -> 2.211225x
2022/10/07 10:21:10 Average Speedup (Worker Pool vs Sequential) -> 2.615010x
2022/10/07 10:21:10 Average Speedup (Worker Pool vs Parallel ) -> 1.182607x
And mind you, this isn't the best we can do! It's possible to achieve far greater speedups with better architecture and design decisions. For now, let's head on over to the syntax.
03-More about syntax
Defer:
- Defer is an keyword which pushes or executes an operation only at the end of that function block or module.
- The very best example is closing a file after certain file operations:
package main
import (
"fmt"
"os"
)
func main() {
f := createFile("/tmp/defer.txt")
defer closeFile(f)
writeFile(f)
}
func createFile(p string) *os.File {
fmt.Println("creating")
f, err := os.Create(p)
if err != nil {
panic(err)
}
return f
}
func writeFile(f *os.File) {
fmt.Println("writing")
fmt.Fprintln(f, "data")
}
func closeFile(f *os.File) {
fmt.Println("closing")
err := f.Close()
if err != nil {
fmt.Fprintf(os.Stderr, "error: %v\n", err)
os.Exit(1)
}
}
- You can push any operation to the end of the execution cycle of a particular module using defer.
Struct:
- A structure or struct in Golang is a user-defined type that allows to group/combine items of possibly different types into a single type.
- Now for those with programming experience, this really looks similar to a Class in Object Oriented Programming however it is not.
- Classes enable inheritance however Struct enables composition.
- Let's now see an example,
- We have an operation which concerns with a student's univeristy record, we need to store 3 to 4 attributes and their respective values and access them in a uniform manner for further operations.
- Now, normally you would create a dictionary or if you are into OVERKILL, 3 to 4 variables per students which seems wasteful and non-uniform.
- Here we can use structs to make our lives easier as it helps us store and access such records with such simplicity and ease!
package main
import(
"fmt"
)
type Student struct{
firstName string
lastName string
age int
USN string
}
func printRecords(s Student){
fmt.Printf("First Name: %s\n Last Name: %s\n Age: %d\n USN: %s\n", s.firstName, s.lastName, s.age, s.USN)
}
func main(){
stud1 := Student{firstName: "Mukund",lastName: "Deepak",age: 20,USN: "PES2UG20CS989"}
printRecords(stud1)
}
04-Concurrency
What are threads, a very naive explaination
Before we get started with concurrency, lets quickly skim through some basics. Try to answer the below questions on your own first.
- What is a program?
- What is a process?
So, in layman terms
- Program is a written set of instructions or syntax with some functionality which will be executed by the computer.
- Process is a program in execution.
- As seen in a computer, multiple processes can execute simultaneously and well that's multiprocessing.
- But the overhead and the work put on creating a process in any OS is quite heavy.
- We use Threads and the concept of Multithreading to achieve what processes and multiprocessing does but with lower overhead and much much more simplicity.
- So, What exactly is a thread?
- A Thread is an independent sequence of execution within a process.
- Lets take an example to understand better :),
- Alice and Joe have the sunday to themselves without their chaotic kids ruining the weekend and they want to enjoy their weekend and go out on a date!
- Saturday afternoon, Alice sees that the house is a mess, so Alice needs to finish all her chores however it is not a task that can be done single-handedly. If done single-handedly, it would take her many hours. So, what can she do? She pulls up her husband and her 3 kids and divides all the chores between them to get them done parallely and well surprisingly gets them done faster and ahead of schedule and gets to enjoy a couple more hours than her Sunday.
- So now given this analogy, let us pick out certain instances and corelate them to concurrent execution and computing.
- Alice doing all the chores single-handedly - sequential execution within a single process.
- Alice dividing all the chores between her family and getting all chores done parallely faster - concurrent multithreaded execution
- Who are the key players a.k.a the threads - alice, joe and her three kids.
- And that's threads and concurrency simplified max folks!
Go routines
What is a goroutine?
- Basically its a non-blocking asynchronous concurrent function call in its simplest definition.
- Confused? Let me clear it up with an example.
- We'll create a routine and we will execute some iterations of this routine as normal function call and as goroutine calls and we will analyse this as we go.
package main
import(
"fmt"
"time"
)
func f(word string){
for i:=0;i<len(word);i++{
fmt.Printf("%c\n", word[i])
}
}
func main(){
f("hello")
go f("navin")
go f("anirudh")
time.Sleep(time.Second) //why are we doing this?
}
- So we have created a function f that takes the string and prints all the letters of the said word when f is called along with a passed string argument.
- The first call is a blocking normal function call we made within our main function.
- The second and the third call are done sequentially but the function execution is happening parallely in their own consective threads!
- So what would be the output of the following program?
h
e
l
l
o
a
n
i
r
u
d
h
n
a
v
i
n
- The reason why anirudh prints before navin is because the threaded executions are placed in a runqueue and since there is not much work for the scheduler to run the goroutines in the runqueue, it is executing it both parallely however the outputs are printed in fifo order of the queue. This is why the last goroutine call will be executed first and then the first goroutine call is executed!
- In the case of very large input strings, interleaving can be expected as the goroutines are concurrently executed by the go runtime.
- We use time.Sleep(time.Second) to wait for the goroutines to execute completely and not exit the main function before they finish execution.
- But is there a better way to do this? That's where WaitGroups come into picture!
- And how do these goroutines or threads to simplify communication or exchange of information with each other? That's where Channels come into picture!
Wait Groups
- We did find a temporary fix to get our go routines up and running i.e. the time.Sleep(time.Second) fix however as computer scientists, our instinct goes to speeding up the execution and 1 second to print two strings seems wasteful as these respective goroutines take barely a few milliseconds!
- So we use a primitive go library's (sync library) function called WaitGroups
- How do WaitGroups work?
- WaitGroup is basically a type of counter that blocks the execution of the goroutine until its internal counter becomes zero.
- Three important procedures to know in WaitGroups:
- Add(int) - It increases WaitGroup counter by given integer value.
- Done() - It decreases WaitGroup counter by 1, we use it to indicate the termination of a goroutine.
- Wait() - It blocks the execution until it's internal counter becomes 0.
- Let us take a code snippet example:
package main
import (
"fmt"
"sync"
"time"
)
func worker(id int) {
fmt.Printf("Worker %d starting\n", id)
time.Sleep(time.Second)
fmt.Printf("Worker %d done\n", id)
}
func main() {
var wg sync.WaitGroup
for i := 1; i <= 5; i++ {
wg.Add(1)
i := i
go func() {
defer wg.Done()
worker(i)
}()
}
wg.Wait()
}
- In the above code snippet, we are executing the worker function in 5 different goroutine calls and its asynchronously running in 5 seperate threads until the end of their execution. At the end of the execution, the waitgroup counter decrements from 5 all the way down to 0 and the Wait() call breaks indicating the end of the threaded execution.
- See how simple WaitGroups make asynchronous goroutine execution without terminating due to the main function and without any time delay!
Go channels
- Let's first see why we need channels as supposed to just a single variables for communication between threads :
Let's consider a like counter...a counter in which a person can either like(+1) or dislike (-1). But before I can explain race conditions. We need to understand the von neumann cycle.
Von Neumann architechture is as so :
fetch->decode->execute->write
What's to note is the "fetch" of data that happens here before execution [that too, 2 steps before exec].
Let's consider that one thread each handles the incoming likes and dislikes. And as we only know normal variables in Go for now, we have no other choice but to have all these threads update the value in this "integer" variables called likes
. Let's now consider 2 requests coming in, both likes at the EXACT SAME TIME. So two threads spawn at the same time and "fetch" the memory of variable likes
. Both do a +1 to likes
variable. and now store it back to the same memory.
WAIIIIIIIITTTT, what just happened??? we got 2 like requests but our value only went up by 1??
this is a very solid introduction to race conditions and we'll discuss in short how we can avoid this. In some cases (in fortunate cases) we will only get inconsistent values, In most unfortunate situations this leads to a runtime error. With this I hope you are sold on channels, now let's look into using them.
- The basic syntax of buffered channels :
first_channel := make(chan string)
first_channel <- "Hello, World!"
our_message := <- first_channel
//We have to use := for "our_message" as the variables needs to figure what it stored by itself, we can also declare its type manually
fmt.Println(our_message)
So in the above code block, we see a channel that is happy to store strings, this is the same for any data type channels. But if you really did run the above code, you'll see we will get a runtime error :(. This is because we told our channel should store strings but did not mention how many, that is we have an unbuffered memory [We'll look into these shortly] . Let's fix that and look into len
and cap
functions of channels.
- Fixing channels :
first_channel := make(chan string,1)
fmt.Println("Before queuing, len :",len(first_channel)) //Shows number of queued messages in the buffer
fmt.Println("Before queuing, cap :",cap(first_channel)) //Shows the total capacity of buffer
first_channel <- "Hello, World!"
fmt.Println("After queuing one string, len :",len(first_channel))
fmt.Println("After queuing one string, cap :",cap(first_channel))
// first_channel <- "Second message :(" //Will cause runtime, is explained in blog.
our_message := <- first_channel
fmt.Println(our_message)
So in the above code, we see a lot going on. Let's break it down :
- We now have a channel size of 1, implying we can store one string in channel.
- the
len()
prints the number of "items" buffered. - the
cap()
prints the max numbers of "items" that can be buffered. Do note, when ever the len() goes above cap() our code panic's in runtime.
- Unbuffered channels basic syntax :
func unbuf_sender(channel chan string, wg *sync.WaitGroup){
defer wg.Done()
defer close(channel)
i := 0
for i<5{
channel <- "Hello World "+fmt.Sprint(i)
i++
}
}
func unbuf_reader(channel chan string, wg *sync.WaitGroup){
defer wg.Done()
for{
res, ok := <- channel
if !ok{
fmt.Println("We done, adios")
return
}
fmt.Println(res)
}
}
func simple_unbuf_channel(){
unbuf_channels := make(chan string)
wg := new(sync.WaitGroup)
wg.Add(1)
go unbuf_sender(unbuf_channels, wg)
wg.Add(1)
go unbuf_reader(unbuf_channels, wg) //Code will panic if this receiver isnt active
wg.Wait()
}
If you are sure if a channels always has an active recviever, we can let our channels be unbuffered! And the above example is accurate 100% of the time!! With that we come to an end of looking at channels and threads communication at quite the depth!
For advanced readers, the race condition problem we discusses earlier has been fixed using channels, ones intrested can see it's code here!
The same example...just much faster!
Here's the same example, Fizzbuzz, but parallelized -
func worker_pool_fizzbuzz(start, end, worker_count int) [][]string {
// use a worker pool to orchestrate the parallelization of Fizzbuzz
jump_div_check := (end - start) % worker_count
if jump_div_check != 0 {
log.Fatalln("The degree must perfectly divide the number of entries")
}
jump := (end - start) / worker_count
iter_start := start
job_channel := make(chan []int, worker_count)
results_channel := make(chan []string, worker_count)
var results [][]string
for i := 0; i < worker_count; i++ {
go func(job_chan <-chan []int, result_chan chan<- []string) {
// Worker!
for start_end_stats := range job_chan {
// Spawn the new job
start := start_end_stats[0]
end := start_end_stats[1]
// Send the results back to the
result_chan <- fizzbuzz_sequential(start, end)
}
}(job_channel, results_channel)
}
for {
if iter_start < end {
range_start := iter_start
range_end := iter_start + jump
// fmt.Println("Dispatching Job (START, END)", range_start+1, range_end)
job_channel <- []int{range_start + 1, range_end}
iter_start += jump
} else {
break
}
}
close(job_channel)
for i := 0; i < worker_count; i++ {
results = append(results, <-results_channel)
}
return results
}
What can you build with Go?
There's plenty of production-grade software written in Go! Here are a few examples you're going to have heard of -
- Docker
- Kubernetes
- Prometheus
- ScyllaDB
- CockroachDB
But there are also easier-to-follow examples and sample projects that we've got in-house. For example,
GRDNS
GRDNS is DNS server, namely : Go Redis Domain Name Server. It's entire point was to have insanely fast resolve times using in memory caches. Go helps with scaling and its very nice network libraries! The project can be found here
Uturn
Uturn is a simple URL shortener (much like bit.ly) written in Go. You can find the source code for the same here