I recently reimplemented one of my projects in Go; it was originally written in C++. When I finished the code and ran a performance test, I was shocked by the result: querying the database from C++, I can fetch the 130-million-row result set in about 5 minutes, but with Go it takes almost 45 minutes. Yet when I extract the relevant code from the project into a standalone snippet, it finishes in about 2 minutes. Why is there such a huge performance difference?
My code snippet:
https://gist.github.com/pyanfield/2651d23311901b33c5723b7de2364148
package main

import (
	"database/sql"
	"fmt"
	"runtime"
	"strconv"
	"time"

	_ "github.com/go-sql-driver/mysql"
)

func main() {
	runtime.GOMAXPROCS(runtime.NumCPU())
	// defer profile.Start(profile.CPUProfile, profile.ProfilePath(".")).Stop()
	dbRead, err := connectDB("test:test@tcp(127.0.0.1:3306)/test_oltp?charset=utf8&readTimeout=600s&writeTimeout=600s")
	if err != nil {
		fmt.Printf("Error happened when connecting to DB. %s\n", err.Error())
		return
	}
	defer dbRead.Close()
	dbRead.SetMaxIdleConns(0)
	dbRead.SetMaxOpenConns(100)
	query := "WHERE company_id IN (11,22,33,44,55,66,77,88,99,00,111,222,333,4444,555,666,777,888,999)"
	relations := getRelations(dbRead, query)
	fmt.Printf(">>> got %d relations\n", len(relations))
}
func connectDB(addr string) (*sql.DB, error) {
	db, err := sql.Open("mysql", addr)
	if err != nil {
		return nil, err
	}
	if err = db.Ping(); err != nil {
		return nil, err
	}
	return db, nil
}
type Relation struct {
	childId  int64
	parentId int64
}
func getRelations(db *sql.DB, where string) []Relation {
	begin := time.Now()
	var err error
	var rows *sql.Rows
	query := fmt.Sprintf("SELECT `child_id`, `parent_id` FROM `test_relations` %s", where)
	rows, err = db.Query(query)
	if err != nil {
		fmt.Println("query error:", err.Error())
		return nil
	}
	defer rows.Close()
	columns, err := rows.Columns()
	buffer := make([]sql.RawBytes, len(columns))
	scanArgs := make([]interface{}, len(buffer))
	for i := range scanArgs {
		scanArgs[i] = &buffer[i]
	}
	relations := []Relation{}
	relation := Relation{}
	for rows.Next() {
		if err = rows.Scan(scanArgs...); err != nil {
			fmt.Println("scan:", err.Error())
			return nil
		}
		relation.parentId, _ = strconv.ParseInt(string(buffer[1]), 10, 64)
		relation.childId, _ = strconv.ParseInt(string(buffer[0]), 10, 64)
		relations = append(relations, relation)
	}
	if err = rows.Err(); err != nil {
		fmt.Println("next error:", err.Error())
		return nil
	}
	fmt.Printf(">>> getRelations cost: %s\n", time.Since(begin).String())
	// output: >>> getRelations cost: 1m45.791047s
	return relations
	// len(relations): 131123541
}
Update:
My Go version is 1.6. The CPU profiles I collected are below.
The code snippet's CPU profile, top 20:
75.67s of 96.82s total (78.16%)
Dropped 109 nodes (cum <= 0.48s)
Showing top 20 nodes out of 82 (cum >= 12.04s)
flat flat% sum% cum cum%
11.85s 12.24% 12.24% 11.85s 12.24% runtime.memmove
10.28s 10.62% 22.86% 20.01s 20.67% runtime.mallocgc
5.82s 6.01% 28.87% 5.82s 6.01% strconv.ParseUint
5.79s 5.98% 34.85% 5.79s 5.98% runtime.futex
3.42s 3.53% 38.38% 10.28s 10.62% github.com/go-sql-driver/mysql.(*buffer).readNext
3.42s 3.53% 41.91% 6.38s 6.59% runtime.scang
3.37s 3.48% 45.39% 36.97s 38.18% github.com/go-sql-driver/mysql.(*textRows).readRow
3.37s 3.48% 48.87% 3.37s 3.48% runtime.memclr
3.20s 3.31% 52.18% 3.20s 3.31% runtime.heapBitsSetType
3.02s 3.12% 55.30% 7.36s 7.60% database/sql.convertAssign
2.96s 3.06% 58.36% 3.02s 3.12% runtime.(*mspan).sweep.func1
2.53s 2.61% 60.97% 2.53s 2.61% runtime._ExternalCode
2.39s 2.47% 63.44% 2.96s 3.06% runtime.readgstatus
2.24s 2.31% 65.75% 8.06s 8.32% strconv.ParseInt
2.21s 2.28% 68.03% 5.24s 5.41% runtime.heapBitsSweepSpan
2.15s 2.22% 70.25% 7.68s 7.93% runtime.rawstring
2.06s 2.13% 72.38% 3.18s 3.28% github.com/go-sql-driver/mysql.readLengthEncodedString
1.95s 2.01% 74.40% 12.23s 12.63% github.com/go-sql-driver/mysql.(*mysqlConn).readPacket
1.83s 1.89% 76.29% 79.42s 82.03% main.Relations
1.81s 1.87% 78.16% 12.04s 12.44% runtime.slicebytetostring
The project's CPU profile, top 20:
(pprof) top20
38.71mins of 42.82mins total (90.40%)
Dropped 334 nodes (cum <= 0.21mins)
Showing top 20 nodes out of 76 (cum >= 1.35mins)
flat flat% sum% cum cum%
12.02mins 28.07% 28.07% 12.48mins 29.15% runtime.addspecial
5.95mins 13.89% 41.96% 15.08mins 35.21% runtime.pcvalue
5.26mins 12.29% 54.25% 5.26mins 12.29% runtime.readvarint
2.60mins 6.08% 60.32% 7.87mins 18.37% runtime.step
1.98mins 4.62% 64.94% 19.45mins 45.43% runtime.gentraceback
1.65mins 3.86% 68.80% 1.65mins 3.86% runtime/internal/atomic.Xchg
1.57mins 3.66% 72.46% 2.93mins 6.84% runtime.(*mspan).sweep
1.52mins 3.54% 76.01% 1.78mins 4.15% runtime.findfunc
1.41mins 3.30% 79.31% 1.42mins 3.31% runtime.markrootSpans
1.13mins 2.64% 81.95% 1.13mins 2.64% runtime.(*fixalloc).alloc
0.64mins 1.50% 83.45% 0.64mins 1.50% runtime.duffcopy
0.46mins 1.08% 84.53% 0.46mins 1.08% runtime.findmoduledatap
0.44mins 1.02% 85.55% 0.44mins 1.02% runtime.fastrand1
0.42mins 0.97% 86.52% 15.49mins 36.18% runtime.funcspdelta
0.38mins 0.89% 87.41% 36.02mins 84.13% runtime.mallocgc
0.30mins 0.7% 88.12% 0.78mins 1.83% runtime.scanobject
0.26mins 0.6% 88.72% 0.32mins 0.74% runtime.stkbucket
0.26mins 0.6% 89.32% 0.26mins 0.6% runtime.memmove
0.23mins 0.55% 89.86% 0.23mins 0.55% runtime.heapBitsForObject
0.23mins 0.53% 90.40% 1.35mins 3.15% runtime.lock
I found my answer and want to share it. This was caused by my own mistake. Some time ago I tried to add memory profiling and set runtime.MemProfileRate = 1 in my init method, but I forgot to reset it to a reasonable value afterwards, and I overlooked that method every time I reviewed my code. After removing this setting from the project, the query of those 130M rows is back to roughly 5~6 minutes, which is pretty close to the C++ version. My advice: be careful when you set runtime.MemProfileRate = 1 unless you are sure you want it, and remember to set it back.
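In hindsight the project profile above makes sense: with runtime.MemProfileRate = 1, every single allocation records a call stack, which is why runtime.gentraceback, runtime.pcvalue, and runtime.addspecial dominate. For illustration, a minimal sketch of the kind of init-time profiling setup that causes this; the init function, main body, and file name are illustrative, not the project's actual code:

package main

import (
	"fmt"
	"os"
	"runtime"
	"runtime/pprof"
)

func init() {
	// Sample every single allocation. Great for exhaustive memory profiles,
	// but it makes allocation-heavy code (like scanning 130M rows) far slower,
	// and it is easy to forget it was ever set.
	runtime.MemProfileRate = 1
}

func main() {
	// ... run the workload being profiled ...

	// Write the heap profile out ("mem.pprof" is just an example name).
	f, err := os.Create("mem.pprof")
	if err != nil {
		fmt.Println("create profile file:", err)
		return
	}
	defer f.Close()
	if err := pprof.WriteHeapProfile(f); err != nil {
		fmt.Println("write heap profile:", err)
	}
}

The default rate samples roughly one allocation per 512 KB allocated, which is usually enough; if you do raise it temporarily, set it back before any performance-sensitive run.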
Golang is likely running the DB query processing more in parallel for the snippet alone. Your complete application is almost certainly using some of those cores for other things.
The loop where you process all 130M rows seems the likely culprit.
Try setting GOMAXPROCS to 1 in the snippet if you want to test this theory.
I have a function that calculates the number of days since a particular timestamp, where the timestamp comes from an external API (parsed as a string from the API's JSON response).
I have been following this article on how to test functions that use time.Now():
https://medium.com/go-for-punks/how-to-test-functions-that-use-time-now-ea4f2453d430
My function looks like this:
type funcTimeType func() time.Time // as suggested in the article

func ageOfReportDays(dateString string, funcTime funcTimeType) (int, error) {
	// the date string will look like this:
	// "2022-08-30 09:05:27.567995"
	parseLayout := "2006-01-02 15:04:05.000000"
	t, err := time.Parse(parseLayout, dateString)
	if err != nil {
		return 0, fmt.Errorf("error parsing datetime value %v: %w", dateString, err)
	}
	days := int(time.Since(t).Abs().Hours() / 24)
	return days, nil
}
As you can see, I am not using the funcTime funcTimeType in my actual function, as indicated in the article, because I cannot figure out how my function would be implemented with that.
The unit test I would hope to run would be something like this:
func Test_ageOfReportDays(t *testing.T) {
	t.Run("timestamp age in days test", func(t *testing.T) {
		parseLayout := "2006-01-02 15:04:05.000000"
		dateString := "2022-08-30 09:05:27.567995" // example of recent timestamp
		mockNow := func() time.Time {
			fakeTime, _ := time.Parse(parseLayout, "2023-01-20 09:00:00.000000")
			return fakeTime
		}
		// now I want to use "fakeTime" to spoof "time.Now()" so I can test my function
		got, _ := ageOfReportDays(dateString, mockNow)
		expected := 152
		if got != expected {
			t.Errorf("expected '%d' but got '%d'", expected, got)
		}
	})
}
Obviously the logic doesn't quite line up between my code and the article author's code.
Is there a good way for me to write a unit test for this function, based on how the article suggests mocking time.Now()?
You are pretty close. Changing time.Since(t) to funcTime().Sub(t) would probably get you past the finish line.
From the time package docs:
time.Since returns the time elapsed since t. It is shorthand for time.Now().Sub(t).
Example function:
import (
	"fmt"
	"time"
)

const parseLayout = "2006-01-02 15:04:05.000000"

type funcTimeType func() time.Time // as suggested in the article

func ageOfReportDays(dateString string, funcTime funcTimeType) (int, error) {
	t, err := time.Parse(parseLayout, dateString)
	if err != nil {
		return 0, fmt.Errorf("parsing datetime value %v: %w", dateString, err)
	}
	days := int(funcTime().Sub(t).Hours() / 24)
	return days, nil
}
And a test:
import (
	"testing"
	"time"
)

func Test_ageOfReportDays(t *testing.T) {
	t.Run("timestamp age in days test", func(t *testing.T) {
		dateString := "2022-08-30 09:05:27.567995" // example of recent timestamp
		mockNow := func() time.Time {
			fakeTime, _ := time.Parse(parseLayout, "2023-01-20 09:00:00.000000")
			return fakeTime
		}
		// use "fakeTime" to spoof "time.Now()" inside the function under test
		got, _ := ageOfReportDays(dateString, mockNow)
		expected := 142
		if got != expected {
			t.Errorf("expected '%d' but got '%d'", expected, got)
		}
	})
}
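(Sanity check on the expected value: from 2022-08-30 09:05:27 to 2023-01-20 09:00:00 is just under 143 full days, so the integer conversion yields 142.)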
I'm trying to mock a method written with sqlboiler, but I'm having massive trouble building the mock query.
The model I'm trying to mock looks like this:
type Course struct {
	ID          int
	Name        string
	Description null.String
	EnrollKey   string
	ForumID     int
	CreatedAt   null.Time
	UpdatedAt   null.Time
	DeletedAt   null.Time
	R           *courseR
	L           courseL
}
For simplicity, I want to test the GetCourse method
func (p *PublicController) GetCourse(id int) (*models.Course, error) {
	c, err := models.FindCourse(context.Background(), p.Database, id)
	if err != nil {
		return nil, err
	}
	return c, nil
}
with this test
func TestGetCourse(t *testing.T) {
	db, mock, err := sqlmock.New()
	if err != nil {
		t.Fatalf("an error '%s' was not expected", err)
	}
	oldDB := boil.GetDB()
	defer func() {
		db.Close()
		boil.SetDB(oldDB)
	}()
	boil.SetDB(db)
	ctrl := &PublicController{db}

	rows := sqlmock.NewRows([]string{"ID", "Name", "Description", "EnrollKey", "ForumID"}).AddRow(42, "Testkurs", "12345", 33)
	query := regexp.QuoteMeta("SELECT ID, Name, Description, EnrollKey, ForumID FROM courses WHERE ID = ?")
	//mockQuery := regexp.QuoteMeta("SELECT * FROM `courses` WHERE (`course AND (`courses`.deleted_at is null) LIMIT 1;")
	mock.ExpectQuery(query).WithArgs(42).WillReturnRows(rows)

	course, err := ctrl.GetCourse(42)
	assert.NotNil(t, course)
	assert.NoError(t, err)
}
But running this test only returns
Query: could not match actual sql: "select * from course where id=? and deleted_at is null" with expected regexp "SELECT ID, Name, Description, EnrollKey, ForumID FROM courses WHERE ID = ?"
bind failed to execute query
And I can't really find out how to construct it correctly.
How do I correctly mock the sqlboiler-query for running unit tests?
UPDATE
I managed to solve this by using different parameters in AddRow()
.AddRow(c.ID, c.Name, null.String{}, c.EnrollKey, c.ForumID)
and building the query differently
query := regexp.QuoteMeta("select * from `course` where `id`=? and `deleted_at` is null")
Now my issue is that, in contrast to this method, the other methods are far more complex and involve a large number of queries (mainly insert operations). From what I understand, tests of sqlboiler code need to mimic every single interaction made with the database.
How do I extract the necessary queries for this large number of database interactions? I solved my immediate problem by copying the "actual" SQL query from the error message instead of the one I had written, but I'm afraid this procedure is the opposite of efficient testing.
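For reference, here are the two changes from the update combined into a single expectation; this is a sketch that assumes the same test setup as TestGetCourse above, with `c` standing for a hypothetical Course value holding the expected data:

	// Expect the query sqlboiler actually sends, with a value for every selected column.
	rows := sqlmock.NewRows([]string{"ID", "Name", "Description", "EnrollKey", "ForumID"}).
		AddRow(c.ID, c.Name, null.String{}, c.EnrollKey, c.ForumID)

	query := regexp.QuoteMeta("select * from `course` where `id`=? and `deleted_at` is null")
	mock.ExpectQuery(query).WithArgs(42).WillReturnRows(rows)

As for extracting the queries for the more complex methods: the "actual sql" quoted in sqlmock's error message (as used here) is one source, and sqlboiler can also log every statement it generates when its debug mode (boil.DebugMode) is enabled, which makes copying the exact queries into expectations less painful.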
I'm trying to represent money in a go template.
{{.cash}}
But right now, cash comes as 1000000
Would it be possible to make it output 1,000,000 ?
Is there some sort of {{.cash | Currency}} formatter?
If not, how do I go about getting the desired output?
Thanks.
You can leverage github.com/dustin/go-humanize to do this.
// Assumes: import ("log"; "os"; "text/template"; "github.com/dustin/go-humanize")
funcMap := template.FuncMap{
	"comma": humanize.Comma, // formats an int64 with thousands separators
}
t := template.Must(template.New("").Funcs(funcMap).Parse(`A million: {{comma .}}`))
err := t.Execute(os.Stdout, int64(1000000)) // pass an int64 so {{comma .}} type-checks
if err != nil {
	log.Fatalf("execution: %s", err)
}
// Output: A million: 1,000,000
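Note that humanize.Comma expects an int64; if your cash value is a float, humanize.Commaf does the same job for float64. The same funcMap entry also works in pipeline form, e.g. {{.cash | comma}}.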
I'm learning Go by writing an HTTP load-testing client like Apache's ab. The code below seems pretty straightforward: I create a configurable number of goroutines, each of which sends a portion of the overall HTTP requests and records the result. I iterate over the resultChan channel and inspect/record each result. This works fine when the number of messages is, say, 100. When I increase the number of messages, however, it hangs, and htop shows a VIRT of 138G for the process.
Here's the code in question:
package main

import (
	"fmt"
	"net/http"
	"time"
)

const (
	SUCCESS = iota
	TOTAL   = iota
	TIMEOUT = iota
	ERROR   = iota
)

type Result struct {
	successful int
	total      int
	timeouts   int
	errors     int
	duration   time.Duration
}

func makeRequests(url string, messages int, resultChan chan<- *http.Response) {
	for i := 0; i < messages; i++ {
		resp, _ := http.Get(url)
		if resp != nil {
			resultChan <- resp
		}
	}
}

func deployRequests(url string, threads int, messages int) *Result {
	results := new(Result)
	resultChan := make(chan *http.Response)
	start := time.Now()
	defer func() {
		fmt.Printf("%s\n", time.Since(start))
	}()
	for i := 0; i < threads; i++ {
		go makeRequests(url, (messages/threads)+1, resultChan)
	}
	for response := range resultChan {
		if response.StatusCode != 200 {
			results.errors += 1
		} else {
			results.successful += 1
		}
		results.total += 1
		if results.total == messages {
			return results
		}
	}
	return results
}

func main() {
	results := deployRequests("http://www.google.com", 10, 1000)
	fmt.Printf("Total: %d\n", results.total)
	fmt.Printf("Successful: %d\n", results.successful)
	fmt.Printf("Error: %d\n", results.errors)
	fmt.Printf("Timeouts: %d\n", results.timeouts)
	fmt.Printf("%s", results.duration)
}
There are obviously some things missing or stupidly done (no timeout checking, the channel is unbuffered, etc.), but I wanted to get the basic case working before fixing those. What is it about the program as written that causes so much memory allocation?
As far as I can tell, there are just 10 goroutines. If one is created per HTTP request, which would make sense of the memory use, how do the operations in my loop end up creating that many goroutines? Or is the issue totally unrelated?
I think the sequence leading to the hang is:
1. http.Get in makeRequests fails (connection refused, request timeout, etc.), returning a nil response and an error value.
2. The error is ignored and makeRequests moves on to the next request.
3. If any errors occur, makeRequests posts fewer than the expected number of results to resultChan.
4. The for ... range ... chan loop in deployRequests never breaks, because results.total is always less than messages.
One workaround would be:
If http.Get returns an error value, post a nil response to resultChan:
resp, err := http.Get(url)
if err != nil {
	resultChan <- nil
} else if resp != nil {
	resultChan <- resp
}
In deployRequests, if the for loop reads a nil value from resultChan, count that as an error:
for response := range resultChan {
	if response == nil {
		results.errors += 1
	} else if response.StatusCode != 200 {
		// ...
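Putting the two fragments together, makeRequests would look roughly like this; it keeps the signature and imports of the program above and only changes the error handling:

func makeRequests(url string, messages int, resultChan chan<- *http.Response) {
	for i := 0; i < messages; i++ {
		resp, err := http.Get(url)
		if err != nil {
			// Failed request (connection refused, timeout, ...): still post a
			// result so the receiver's count eventually reaches `messages`.
			resultChan <- nil
			continue
		}
		resultChan <- resp
	}
}

With this change every loop iteration posts exactly one result, so the receiving loop in deployRequests can always reach results.total == messages and return.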
Here's a handler I wrote to retrieve a document from MongoDB.
If the document is found, we load and render the template accordingly.
If it is not found, the handler redirects to the 404 page.
func EventNextHandler(w http.ResponseWriter, r *http.Request) {
	search := bson.M{"data.start_time": bson.M{"$gte": time.Now()}}
	sort := "data.start_time"

	var result *Event
	err := db.Find(&Event{}, search).Sort(sort).One(&result)
	if err != nil && err != mgo.ErrNotFound {
		panic(err)
	}
	if err == mgo.ErrNotFound {
		fmt.Println("No such object in db. Redirect")
		http.Redirect(w, r, "/404/", http.StatusFound)
		return
	}

	// TODO:
	// This is the absolute path parsing of template files so tests will pass
	// Code can be better organized
	var eventnext = template.Must(template.ParseFiles(
		path.Join(conf.Config.ProjectRoot, "templates/_base.html"),
		path.Join(conf.Config.ProjectRoot, "templates/event_next.html"),
	))

	type templateData struct {
		Context *conf.Context
		E       *Event
	}

	data := templateData{conf.DefaultContext(conf.Config), result}
	eventnext.Execute(w, data)
}
Manually trying this out, everything works great.
However, I can't seem to get this to pass my unit tests. In a corresponding test file, this is my test code to attempt to load my EventNextHandler.
func TestEventNextHandler(t *testing.T) {
	// integration test on http requests to EventNextHandler
	request, _ := http.NewRequest("GET", "/events/next/", nil)
	response := httptest.NewRecorder()

	EventNextHandler(response, request)
	if response.Code != http.StatusOK {
		t.Fatalf("Non-expected status code %v:\n\tbody: %v", "200", response.Code)
	}
}
The test fails at the line stating EventNextHandler(response, request).
And in my error message, it refers to the line err := db.Find(&Event{}, search).Sort(sort).One(&result) in my handler code.
The complete error message:
=== RUN TestEventNextHandler
--- FAIL: TestEventNextHandler (0.00 seconds)
panic: runtime error: invalid memory address or nil pointer dereference [recovered]
panic: runtime error: invalid memory address or nil pointer dereference
[signal 0xb code=0x1 addr=0x8 pc=0x113eb8]
goroutine 4 [running]:
testing.func·004()
/usr/local/go/src/pkg/testing/testing.go:348 +0xcd
hp/db.Cursor(0xc2000c3cf0, 0xc2000fb1c0, 0x252d00)
/Users/calvin/gopath/src/hp/db/db.go:57 +0x98
hp/db.Find(0xc2000c3cf0, 0xc2000fb1c0, 0x252d00, 0xc2000e55c0, 0x252d00, ...)
/Users/calvin/gopath/src/hp/db/db.go:61 +0x2f
hp/event.EventNextHandler(0xc2000e5580, 0xc2000a16a0, 0xc2000c5680)
/Users/calvin/gopath/src/hp/event/controllers.go:106 +0x1da
hp/event.TestEventNextHandler(0xc2000d5240)
/Users/calvin/gopath/src/hp/event/controllers_test.go:16 +0x107
testing.tRunner(0xc2000d5240, 0x526fe0)
/usr/local/go/src/pkg/testing/testing.go:353 +0x8a
created by testing.RunTests
/usr/local/go/src/pkg/testing/testing.go:433 +0x86b
goroutine 1 [chan receive]:
testing.RunTests(0x3bdff0, 0x526fe0, 0x1, 0x1, 0x1, ...)
/usr/local/go/src/pkg/testing/testing.go:434 +0x88e
testing.Main(0x3bdff0, 0x526fe0, 0x1, 0x1, 0x532580, ...)
/usr/local/go/src/pkg/testing/testing.go:365 +0x8a
main.main()
hp/event/_test/_testmain.go:43 +0x9a
What is the correct way to write my tests so that they take into account the situation where nothing is retrieved from MongoDB, and assert on that case properly?
It turns out I did not initialize the db in my test, much like the "setup" step in other languages' unit testing frameworks.
I have to make sure my db package is imported and then add two lines to connect to the database and register the indexes.
func TestEventNextHandler(t *testing.T) {
	// set up test database
	db.Connect("127.0.0.1", "test_db")
	db.RegisterAllIndexes()

	// integration test on http requests to EventNextHandler
	request, _ := http.NewRequest("GET", "/events/next/", nil)
	response := httptest.NewRecorder()

	EventNextHandler(response, request)
	if response.Code != 302 {
		t.Fatalf("Non-expected status code %v:\n\tbody: %v", "302", response.Code)
	}
}
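If more tests in this package need the same database, one option (not part of the original answer, just a common pattern) is to do the setup once in TestMain rather than in each test:

// Runs once for the whole test package. Assumes the same db package as above
// and an additional "os" import.
func TestMain(m *testing.M) {
	db.Connect("127.0.0.1", "test_db")
	db.RegisterAllIndexes()

	os.Exit(m.Run())
}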