Let's build a Full-Text Search engine
Full-Text Search is one of those tools people use every day without realizing it. If you've ever googled "golang coverage report" or tried to find "indoor wireless camera" on an e-commerce website, you've used some kind of full-text search.
Full-Text Search (FTS) is a technique for searching text in a collection of documents. A document can refer to a web page, a newspaper article, an email message, or any structured text.
Today we are going to build our own FTS engine. By the end of this post, we'll be able to search across millions of documents in less than a millisecond. We'll start with simple search queries like "give me all documents that contain the word cat" and we'll extend the engine to support more sophisticated boolean queries.
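To make the "all documents that contain the word cat" query concrete, here is a minimal sketch. This excerpt doesn't say how the engine is implemented, so the sketch assumes one common approach: an inverted index that maps each lowercased token to the IDs of the documents containing it. The names `index`, `tokenize`, `add` and `search` are illustrative, not taken from the post.

```go
package main

import (
	"fmt"
	"strings"
	"unicode"
)

// index maps a lowercased token to the IDs of the documents that contain it.
// This is an assumed, simplified design for illustration only.
type index map[string][]int

// tokenize splits text on every character that is not a letter or a digit
// and lowercases the resulting tokens.
func tokenize(text string) []string {
	tokens := strings.FieldsFunc(text, func(r rune) bool {
		return !unicode.IsLetter(r) && !unicode.IsNumber(r)
	})
	for i, t := range tokens {
		tokens[i] = strings.ToLower(t)
	}
	return tokens
}

// add indexes a document under the given ID.
// Documents are expected to be added in increasing ID order.
func (idx index) add(id int, text string) {
	for _, token := range tokenize(text) {
		ids := idx[token]
		if len(ids) > 0 && ids[len(ids)-1] == id {
			continue // token already recorded for this document
		}
		idx[token] = append(ids, id)
	}
}

// search returns the IDs of the documents that contain the exact word.
func (idx index) search(word string) []int {
	return idx[strings.ToLower(word)]
}

func main() {
	docs := []string{
		"A cat sat on the mat.",
		"Dogs chase cats.",
		"The Cat and the hat.",
	}
	idx := make(index)
	for id, doc := range docs {
		idx.add(id, doc)
	}
	fmt.Println(idx.search("cat")) // [0 2]
}
```

Note that "cats" in the second document does not match "cat": the sketch compares exact tokens and does no stemming, so a lookup is a single map access regardless of how many documents are indexed.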
String interning in Go
String interning is a technique for storing only one copy of each unique string in memory. It can significantly reduce memory usage for applications that store many duplicate strings.
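A minimal sketch of the idea, assuming a plain map is good enough: the first time a string is seen it becomes the canonical copy, and every later equal string is replaced by that copy so the duplicate can be garbage collected. The `interner` type and its `intern` method are hypothetical names for illustration, and the sketch is not safe for concurrent use.

```go
package main

import "fmt"

// interner keeps one canonical copy of each distinct string.
// Hypothetical helper for illustration; not safe for concurrent use.
type interner map[string]string

// intern returns the canonical copy of s, storing s itself
// if no equal string has been seen before.
func (in interner) intern(s string) string {
	if canonical, ok := in[s]; ok {
		return canonical
	}
	in[s] = s
	return s
}

func main() {
	in := make(interner)
	// Converting from []byte allocates a fresh string each time,
	// which mimics reading duplicated values from input.
	a := in.intern(string([]byte("golang")))
	b := in.intern(string([]byte("golang")))
	fmt.Println(a == b, len(in)) // true 1
}
```

After the second call, `b` refers to the copy stored on the first call, so only one "golang" string stays reachable through the interner. The trade-off is that the map itself holds every interned string alive for as long as the interner exists.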
Pogreb - key-value store for read-heavy workloads
Note
This post is outdated, please read the new design document on GitHub.
A few months ago I released the first version of an embedded on-disk key-value store written in Go. The store is about 10 times faster than LevelDB for random lookups. I'll explain why it's faster, but first let's talk about the reason why I decided to create my own key-value store.
Porting Go web applications to AWS Lambda
Running Go on AWS Lambda is not something totally new - developers figured out how to launch Go binaries from Python a while ago, but it wasn't convenient and had some performance implications.
A few days ago Amazon announced official Go support for AWS Lambda.
Handling C++ exceptions in Go
Cgo is a mechanism that allows Go packages to call C code. The Go compiler enables cgo for every .go source file that imports the special pseudo-package "C". The text in the comment immediately before the import "C" line is treated as C code. You can include headers and define functions, types and variables - everything normal C code can do:
package main

/*
#include <stdio.h>

void foo(int x) {
	printf("x: %d\n", x);
}
*/
import "C"

func main() {
	C.foo(C.int(123)) // x: 123
}
Profiling and optimizing Go web applications
Note
This post was updated on 2021-04-25.
Go has a powerful built-in profiler that supports CPU, memory, goroutine and block (contention) profiling.
Enabling the profiler
Go provides a low-level profiling API, runtime/pprof, but if you are developing a long-running service, it's more convenient to work with the high-level net/http/pprof package.
All you need to enable the profiler is to import net/http/pprof and it will automatically register the required HTTP handlers:
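package main

import (
	"log"
	"net/http"
	_ "net/http/pprof" // registers the /debug/pprof/ handlers on the default mux
)

func hiHandler(w http.ResponseWriter, r *http.Request) {
	w.Write([]byte("hi"))
}

func main() {
	http.HandleFunc("/", hiHandler)
	// ListenAndServe only returns on failure, so report the error.
	log.Fatal(http.ListenAndServe(":8080", nil))
}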
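For comparison, here is a rough sketch of the low-level runtime/pprof API mentioned above, which suits short-lived programs that don't run an HTTP server. `pprof.StartCPUProfile` and `pprof.StopCPUProfile` are real functions from the standard library; `busyWork` and `profileCPU` are hypothetical helpers made up for this example, and the profile is written to an in-memory buffer only to keep the sketch self-contained.

```go
package main

import (
	"bytes"
	"fmt"
	"log"
	"runtime/pprof"
)

// busyWork burns some CPU so the profiler has something to sample.
func busyWork(n int) int {
	sum := 0
	for i := 0; i < n; i++ {
		sum += i % 7
	}
	return sum
}

// profileCPU runs fn under the CPU profiler and returns the raw profile data.
func profileCPU(fn func()) ([]byte, error) {
	var buf bytes.Buffer
	if err := pprof.StartCPUProfile(&buf); err != nil {
		return nil, err
	}
	fn()
	pprof.StopCPUProfile() // stops sampling and flushes the profile to buf
	return buf.Bytes(), nil
}

func main() {
	data, err := profileCPU(func() { busyWork(50_000_000) })
	if err != nil {
		log.Fatal(err)
	}
	fmt.Printf("collected %d bytes of CPU profile data\n", len(data))
}
```

In a real program the profile would more likely go to a file created with os.Create, which can then be inspected with go tool pprof.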
Iceland 2016
Summers are so hot in Philadelphia that we decided to chill out a little bit in Iceland on the way back to the USA from Russia. Also, spending an entire vacation on Russia alone would be a total waste.
Snowboarding in the USA 2016
Summer is the best time to recall the good cold winter days. During the 2015-2016 winter I had a chance to go snowboarding in Colorado and Pennsylvania.
Scraping the Web with AWS Lambda and PhantomJS
Here are the slides from my talk "Scraping the Web with AWS Lambda and PhantomJS" given at the Greater Philadelphia AWS User Group meetup on May 25, 2016.
You can find the source code of the PhantomJS/Node.js web scraper for AWS Lambda at https://github.com/akrylysov/lambda-phantom-scraper.
Turks & Caicos
Late December, 70 °F (21 °C), winter hasn't arrived in Philadelphia yet, but we decided to spend our "winter break" lying on the beach. There are many destinations within a short flight from the USA - Mexico, Cuba and a dozen small island countries. For some places, like Florida, Hawaii, Puerto Rico and the US Virgin Islands, you don't even need to leave the territory of the country.