Working With Forks in Go

This has been written about before in the Go universe, but I felt it was worth reviewing as it has come up many times in the Go Slack (invite link). When developing on a separate remote from the original in Go, there is a common point of confusion when a developer forks on GitHub (for example) and then proceeds to go get the fork. This places the fork in the wrong spot in their $GOPATH and none of the imports work. Read On →

Managing Syscall Overhead with crypto/rand

The overhead of using secure random numbers can be a headache if the generation of those numbers is in your server’s critical path. In this post, I’ll look at a couple of techniques to bypass the overhead of generating random numbers in a Go program and make a recommendation on what method to use. Consider an application that needs to generate a nonce for each request which is also I/O bound, meaning it does more waiting on I/O than anything else. Read On →
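As a point of reference for the scenario described above, here is a minimal sketch of generating a per-request nonce straight from crypto/rand. The helper name `newNonce`, the 16-byte length, and the hex encoding are assumptions made for illustration, not details from the post; this is the straightforward baseline, not one of the overhead-avoiding techniques the post goes on to discuss.

```go
package main

import (
	"crypto/rand"
	"encoding/hex"
	"fmt"
)

// newNonce reads n bytes from crypto/rand and returns them hex-encoded.
// The name, signature, and encoding are hypothetical choices for this sketch.
func newNonce(n int) (string, error) {
	buf := make([]byte, n)
	// Each call to rand.Read goes to the operating system's secure random
	// source, which is where the per-request overhead comes from.
	if _, err := rand.Read(buf); err != nil {
		return "", err
	}
	return hex.EncodeToString(buf), nil
}

func main() {
	nonce, err := newNonce(16)
	if err != nil {
		panic(err)
	}
	fmt.Println(nonce)
}
```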

How to Block Forever in Go

Let me count the ways. There seem to be quite a few ways to block forever in Go. This post is part code golf and part practical advice. All of the code below is simply to find a way to block the current goroutine from progressing while still allowing others to continue. The versions that block forever will actually crash with a fatal error saying that all goroutines are asleep, but the ones that “busy block” will just time out on the playground. Read On →
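One way this can look, as a rough sketch: an empty `select` has no cases that can ever proceed, so it parks the calling goroutine for good. Whether the post itself uses this variant is not shown in the excerpt; the function names `blockForever` and `othersStillRun` are made up for this example.

```go
package main

import (
	"fmt"
	"time"
)

// blockForever parks the calling goroutine permanently: a select statement
// with no cases can never proceed.
func blockForever() {
	select {}
}

// othersStillRun starts one goroutine that blocks forever and reports whether
// a second goroutine was still able to run and signal completion.
func othersStillRun() bool {
	go blockForever() // intentionally never returns

	done := make(chan struct{})
	go func() { close(done) }()

	select {
	case <-done:
		return true
	case <-time.After(time.Second):
		return false
	}
}

func main() {
	fmt.Println("other goroutine ran:", othersStillRun())
}
```

If `main` called `blockForever()` directly with no other runnable goroutines, the runtime would detect the deadlock and stop the program instead of hanging.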

Locking in crypto/rand

As a follow-up to my previous post detailing my journey through some profiling of the math/rand package, I wanted to write about the crypto/rand package. A couple of people have suggested that I take a look at that instead of worrying about locking in math/rand. On the surface, it’s an easy interface that fills a byte slice full of cryptographically secure random data. I modified the rand_default.go program from the previous post to create a new program to pull data from crypto/rand instead of math/rand. Read On →

The Hidden Dangers of Default Rand

This post is based on a real-world problem that I faced when I was developing a load generator program for work. I looked through my code carefully to make sure that there weren’t any bottlenecks in the design or bugs in the implementation, but didn’t find anything explicitly wrong. I did some profiling after that and found something very interesting: deep down in the rand package was a lock that was taking up the majority of the time in the application. Read On →
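To make the contention concrete: the top-level functions in math/rand share one source that is safe for concurrent use, so many goroutines calling them compete for the same lock. A common way to sidestep that, sketched below, is to give each goroutine its own `*rand.Rand`. This is an illustration under stated assumptions, not necessarily the fix the post arrives at; `drawPerGoroutine` and its seeding scheme are hypothetical.

```go
package main

import (
	"fmt"
	"math/rand"
	"sync"
	"time"
)

// drawPerGoroutine starts `workers` goroutines, each drawing n values in
// [0, 100) from its own *rand.Rand, and returns the per-worker sums.
// A *rand.Rand created with rand.New is not shared here, so no lock is
// contended between workers.
func drawPerGoroutine(workers, n int) []int64 {
	totals := make([]int64, workers)
	var wg sync.WaitGroup
	for w := 0; w < workers; w++ {
		wg.Add(1)
		go func(w int) {
			defer wg.Done()
			// Offset the seed by the worker index so workers started in the
			// same instant do not produce identical sequences.
			r := rand.New(rand.NewSource(time.Now().UnixNano() + int64(w)))
			var sum int64
			for i := 0; i < n; i++ {
				sum += int64(r.Intn(100))
			}
			totals[w] = sum // each worker writes only its own slot
		}(w)
	}
	wg.Wait()
	return totals
}

func main() {
	fmt.Println(drawPerGoroutine(4, 1000))
}
```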

An Analysis of the Top 1000 Go Repositories

This analysis was done from copies cloned in the early morning (Pacific Time) of January 2, 2016. Code organization: Most code is a library, so the code is organized as either .go files under the main repo, or as .go files under sub-directories. Many people also organize their code under a sub-directory, like /src, /lib/, /go/, or /pkg/. I can’t manually inspect all of the repositories, but those I did check are apps written in Go rather than libraries. Read On →

The Insanity of Generating All Possible GUIDs

Genesis: There was once a StackOverflow question (“was” because it’s gone now) titled “Fastest way in C# to iterate through ALL Guids Possible”. The premise was that the person wanted to hit a web server with a single request per GUID to determine if the server had data for it. Hilarity ensued. Suggestions: Some people were legitimately trying to be helpful in this quixotic quest. Some pointed out: Floating point math mistakes: 5316911983139663491615228241121399999d==(5316911983139663491615228241121399999d+1d) is true. Read On →
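The floating point observation carries over to Go, whose `float64` is the same 64-bit IEEE 754 type as a C# `double`. The sketch below is a Go translation for illustration only (the original question was about C#), and `absorbsOne` is a hypothetical helper name.

```go
package main

import "fmt"

// absorbsOne reports whether adding 1 to x leaves it unchanged, which happens
// once x is so large that 1 is smaller than the gap between adjacent float64
// values.
func absorbsOne(x float64) bool {
	return x+1 == x
}

func main() {
	huge := 5316911983139663491615228241121399999.0
	fmt.Println(absorbsOne(huge)) // true: the +1 is lost to rounding
	fmt.Println(absorbsOne(1e15)) // false: still below 2^53, so +1 is exact
}
```

A loop counter of this type would therefore stop advancing long before it covered the 128-bit GUID space.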

Widow: Web Crawler Architecture

Widow: In a previous post, I went over the justification for building my own web crawler named Widow. Here I will explain my alternative method for building a large-scale web crawler. Keep in mind that the crawler is still a work in progress (as of the end of 2015) so this is not final. There is still some future work to be done, which will be discussed at the end. Architecture Overview: There are three main stages to the online crawler. Read On →

Goroutine IDs

Do Goroutine IDs Exist? Yes, of course. The runtime has to have some way to track them. Should I use them? No. No. No. No. Are There Packages I Can Use? There exist packages from people on the Go team, with colorful descriptions like “If you use this package, you will go straight to hell.” There are also packages to do goroutine local storage built on top of goroutine IDs, like github.com/jtolds/gls and github.com/tylerb/gls that use dirty knowledge from the above to create an environment contrary to Go design principles. Read On →

Why I Decided to Make My Own Web Crawler

Widow: The web crawler I am making is named Widow, and is freely available on GitHub. Along with Widow, there are a couple of other sub-projects that were, in my mind, necessary to have a decent crawler. It must also be able to parse the robots.txt and sitemap.xml files that websites use to direct web crawler behavior. The projects to perform both of those functions are terminator and exo. They are also available on Maven Central under the com.widowcrawler namespace. Read On →