Go Channels & Synchronization

Coordinating Goroutine Lifecycles with WaitGroups and ErrGroups

Master the sync package to effectively wait for task completion and handle error propagation across multiple concurrent processes.

Programming · Intermediate · 12 min read

Orchestrating Parallel Execution with WaitGroups

Go is famous for making concurrency feel effortless through the use of the go keyword. However, spawning a goroutine is only the beginning of building a production-ready system. The real challenge lies in coordination: ensuring that your main process does not exit before your background workers have finished their tasks. In a production environment, an uncoordinated exit can lead to corrupted data, half-written files, or lost network requests.

While channels are a powerful way to pass data between goroutines, using them solely for synchronization can often feel verbose and clumsy. If your only goal is to wait for a set of workers to complete, the sync package provides a more streamlined primitive called the WaitGroup. This tool acts as a synchronized counter that allows the main thread to block until every individual worker has signaled that its work is done.

The mental model for a WaitGroup is similar to a ticket counter at a stadium entrance. You increment the counter for every person who enters the building and decrement it as they exit. The stadium cannot be closed for the night until the counter returns to exactly zero. This simple counting mechanism provides a robust way to manage the lifecycle of parallel processes without the overhead of channel buffers or complex signaling logic.

Parallel Batch Image Processing

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

func processBatch(id int, wg *sync.WaitGroup) {
	// Use defer to ensure Done is called even if a panic occurs
	defer wg.Done()

	fmt.Printf("Worker %d: Starting image transformation...\n", id)
	time.Sleep(time.Second) // Simulate heavy CPU work
	fmt.Printf("Worker %d: Image processed successfully\n", id)
}

func main() {
	var wg sync.WaitGroup
	images := []int{101, 102, 103, 104, 105}

	for _, id := range images {
		// Increment before starting the goroutine to prevent a race
		wg.Add(1)
		go processBatch(id, &wg)
	}

	// Wait blocks until the internal counter reaches zero
	wg.Wait()
	fmt.Println("Main: All images processed. Exiting.")
}
```

The implementation details of a WaitGroup are deceptively simple but require discipline to use correctly. One of the most common mistakes is calling the Add method inside the spawned goroutine rather than before it. If the scheduler delays the execution of the goroutine, the main routine might reach the Wait call while the counter is still zero, causing the program to exit prematurely. Always call Add in the calling routine to guarantee the counter reflects the intended workload immediately.

Managing the Pitfalls of Pointer Passing

A critical rule when working with the sync package is that you must never copy a WaitGroup after it has been used. Because a WaitGroup contains an internal state that tracks the number of active workers, copying the struct results in a copy of that counter. If you pass a WaitGroup by value to a function, the function will increment or decrement a local copy, leaving the original counter in the main routine unchanged.

This mistake typically results in a deadlock where the main routine waits forever for workers that have already finished, or the program exits before work is complete. To avoid it, always pass a pointer to the WaitGroup as shown in the previous code example; the go vet copylocks check will also flag accidental copies of sync types. Passing a pointer ensures that every worker is operating on the exact same memory address and the same underlying counter state.

  • Always call Add before the go keyword to avoid race conditions with Wait
  • Use defer to call Done to ensure the counter is decremented even during unexpected errors
  • Pass WaitGroups by pointer to maintain a single source of truth for the counter
  • Never call Add with a negative number, as this will trigger an immediate panic

Handling Error Propagation with ErrGroup

While a standard WaitGroup is excellent for tracking completion, it has a significant limitation: it does not handle errors. In a real-world application, such as an API aggregator or a cloud resource provisioner, it is rare for all parallel tasks to succeed every time. If one worker fails, the WaitGroup simply continues waiting for the rest, leaving the developer to find a manual way to collect and report those failures.

The errgroup package, which lives in the golang.org/x/sync module maintained by the Go team, was designed to solve this exact problem. It provides a more sophisticated version of the WaitGroup that tracks the success or failure of each goroutine. It simplifies error propagation by capturing the first error that occurs and returning it to the main caller, allowing for a more graceful and responsive error handling strategy.

Beyond just collecting errors, an ErrGroup can also be used to implement short-circuiting. By combining it with the context package, an ErrGroup can automatically cancel all other active tasks the moment one of them fails. This prevents your system from wasting valuable CPU cycles and network bandwidth on a batch job that has already failed a critical step.

Resilient API Aggregation

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"net/http"

	"golang.org/x/sync/errgroup"
)

func fetchStatus(ctx context.Context, url string) error {
	req, err := http.NewRequestWithContext(ctx, http.MethodGet, url, nil)
	if err != nil {
		return err
	}
	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		return err // Errors are automatically captured by the group
	}
	defer resp.Body.Close()

	if resp.StatusCode != http.StatusOK {
		return errors.New("failed to reach service: " + url)
	}
	return nil
}

func main() {
	g, ctx := errgroup.WithContext(context.Background())
	urls := []string{"https://api.one.com", "https://api.two.com"}

	for _, url := range urls {
		url := url // Capture loop variable (required before Go 1.22)
		g.Go(func() error {
			return fetchStatus(ctx, url)
		})
	}

	if err := g.Wait(); err != nil {
		fmt.Printf("Aggregation failed: %v\n", err)
		return
	}
	fmt.Println("All services are healthy")
}
```

The Power of Structured Concurrency

The combination of ErrGroup and context is a prime example of structured concurrency in Go. This pattern ensures that the lifetime of a goroutine is strictly tied to the function that launched it. By using the context returned from errgroup.WithContext, workers can monitor for a cancellation signal and clean up resources immediately when a sibling fails.

This approach is significantly cleaner than manually managing channels for error reporting. It removes the need for select blocks or complex switch statements just to determine if a job was successful. For any non-trivial concurrent operation that involves input and output, the ErrGroup should be your default choice over a standard WaitGroup.

Protecting Integrity with Mutual Exclusion

In some scenarios, goroutines are not just performing isolated tasks but are actively modifying a shared piece of state. For instance, if you are building a high-frequency trading system or a real-time analytics dashboard, multiple threads might attempt to update a central cache or a counter at the same time. Accessing the same memory location from multiple goroutines simultaneously creates a race condition.

A race condition is particularly dangerous because it does not always cause a crash during development. Instead, it leads to silent data corruption or unpredictable behavior that only appears under high production load. To prevent this, the sync package offers a Mutex, which stands for mutual exclusion, to ensure that only one goroutine can access a critical section of code at a time.

The golden rule of Go concurrency is to share memory by communicating, but when that is not feasible, use a Mutex to ensure that your shared memory is never accessed by two threads at the same time. Performance without correctness is simply a faster way to get the wrong answer.

A Mutex acts like a physical lock on a room. Before a goroutine can read from or write to a protected variable, it must first acquire the lock. If another goroutine already holds the lock, the second goroutine will block and wait its turn. Once the first goroutine is finished and releases the lock, the waiting goroutine can proceed, ensuring total data integrity.

Building a Thread-Safe In-Memory Cache

```go
package main

import (
	"fmt"
	"sync"
)

type SecureCache struct {
	mu    sync.Mutex
	items map[string]string
}

func (c *SecureCache) Update(key, value string) {
	c.mu.Lock()         // Acquire exclusive access
	defer c.mu.Unlock() // Ensure release even if logic changes
	c.items[key] = value
}

func (c *SecureCache) Fetch(key string) string {
	c.mu.Lock()
	defer c.mu.Unlock()
	return c.items[key]
}

func main() {
	cache := &SecureCache{items: make(map[string]string)}
	var wg sync.WaitGroup

	for i := 0; i < 10; i++ {
		wg.Add(1)
		go func(v int) {
			defer wg.Done()
			key := fmt.Sprintf("key-%d", v)
			cache.Update(key, "processed")
		}(i)
	}

	wg.Wait()
	fmt.Println("Cache updates complete and race-free")
}
```

Choosing Between Mutexes and Channels

Choosing the right tool depends on the underlying problem you are trying to solve. Mutexes are generally faster and more efficient for protecting simple state or fine-grained data structures like maps and slices. Channels are better suited for managing the flow of data or for complex coordination where different parts of the system need to communicate their progress.

If your mental model involves a pipeline where data moves from one stage to the next, use channels. If your model involves a single source of truth that multiple workers need to read and update, a Mutex is usually the more appropriate and higher-performance choice. Balancing these two primitives is a key skill for any intermediate Go developer.
