RootTrust Labs Tech Journal: spin_lock_irqsave() and 'spinlock bad magic' in Linux Kernel

spin_lock_irqsave() and "spinlock bad magic" — Comprehensive Guide

Updated for Linux kernel 6.x. A practical guide with code examples, a debugging checklist, and diagrams.

Contents

1. Overview
2. Purpose of spin_lock_irqsave()
3. When to use it
4. Understanding spinlock bad magic
5. Common causes
6. Relation between irqsave & bad magic
7. Correct vs incorrect examples
8. Debugging
9. Developer checklist
10. Diagrams
11. Appendix: Quick reference & links

1. Overview

spin_lock_irqsave() is the kernel primitive drivers commonly use when data is shared between interrupt context (hard IRQ) and process context. Developers occasionally see the kernel message "spinlock bad magic", a debug-time report that a spinlock structure is corrupted or misused. This article explains both topics in depth, how they relate, and how to debug and avoid the problem.

2. Purpose of `spin_lock_irqsave()`

Use spin_lock_irqsave() when a spinlock is shared between interrupt and process contexts. It:

disables local CPU interrupts (saving previous flags),
acquires the spinlock, and
prevents an IRQ from re-entering the lock on the same CPU (avoids self-deadlock).

Typical usage pattern:

unsigned long flags;
spin_lock_irqsave(&lock, flags);
/* critical section — must not sleep */
spin_unlock_irqrestore(&lock, flags);

3. When to use `spin_lock_irqsave()`

Use it if:

The lock is accessed both from an IRQ handler and from process context (workqueue/kernel thread).
The protected section must be atomic across both contexts.
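The caller's interrupt state is not known in advance (interrupts may already be disabled), so it must be saved and restored afterwards rather than unconditionally re-enabled.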

Note: It's valid and common to call spin_lock_irqsave() in process context (kthreads/workqueues).
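
The save-and-restore behaviour described above can be illustrated with a small userspace model. This is only a sketch: `model_irqs_enabled`, `model_irq_save()`, `model_irq_restore()` and the two helpers are hypothetical names invented for this example, and a single boolean stands in for the local CPU's interrupt-enable state. It compares the irqsave style with the unconditional enable that `spin_unlock_irq()` performs.

```c
#include <assert.h>
#include <stdbool.h>

/* Userspace model only: one flag stands in for the local CPU's IRQ-enable bit. */
static bool model_irqs_enabled = true;

/* Model of the "save" half: remember the current state, then disable. */
static unsigned long model_irq_save(void)
{
    unsigned long flags = model_irqs_enabled ? 1UL : 0UL;

    model_irqs_enabled = false;
    return flags;
}

/* Model of the "restore" half: put back exactly what was saved. */
static void model_irq_restore(unsigned long flags)
{
    model_irqs_enabled = (flags != 0);
}

/* irqsave style: the caller's interrupt state survives the helper. */
static bool helper_with_irqsave(void)
{
    unsigned long flags = model_irq_save();

    /* critical section would go here */
    model_irq_restore(flags);
    return model_irqs_enabled;
}

/* spin_lock_irq()/spin_unlock_irq() style: always re-enables on exit. */
static bool helper_with_irq(void)
{
    model_irqs_enabled = false;
    /* critical section would go here */
    model_irqs_enabled = true;
    return model_irqs_enabled;
}

/* Set the caller's state, run one helper, report the state afterwards. */
static bool state_after_helper(bool caller_irqs_enabled, bool use_irqsave)
{
    model_irqs_enabled = caller_irqs_enabled;
    return use_irqsave ? helper_with_irqsave() : helper_with_irq();
}
```

In this model, a caller that entered with interrupts disabled gets them back disabled only from the irqsave-style helper; the unconditional variant silently turns them on, which is the situation the saved flags are meant to prevent.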

4. Understanding `"spinlock bad magic"`

The kernel’s spinlock debugging (enabled with CONFIG_DEBUG_SPINLOCK) stores a sentinel "magic" number in each spinlock_t. This helps detect misuse and memory corruption. If the stored value differs from the expected sentinel (0xdead4ead), the kernel prints a "BUG: spinlock bad magic" report.

It means the lock structure is corrupted, uninitialized, freed, or otherwise used incorrectly — not that spin_lock_irqsave() is inherently broken.
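
To make the idea concrete, here is a minimal userspace sketch of a magic check. It is not the kernel's implementation: `struct model_spinlock` and the `model_*` functions are hypothetical stand-ins, the model is single-threaded, and it returns an error code where a debug kernel would print a report. Only the sentinel value 0xdead4ead is taken from the text above.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define MODEL_SPINLOCK_MAGIC 0xdead4eadu  /* sentinel value cited in the article */

/* Hypothetical stand-in for a debug-enabled spinlock. */
struct model_spinlock {
    int      locked;
    uint32_t magic;
};

static void model_spin_lock_init(struct model_spinlock *lock)
{
    lock->locked = 0;
    lock->magic  = MODEL_SPINLOCK_MAGIC;
}

/* Returns 0 on success, -1 where a debug kernel would report "bad magic". */
static int model_spin_lock(struct model_spinlock *lock)
{
    if (lock->magic != MODEL_SPINLOCK_MAGIC)
        return -1;
    lock->locked = 1;  /* single-threaded model: no actual spinning */
    return 0;
}

/* Zeroed memory, as after kzalloc() with no init call: magic is 0. */
static int lock_zeroed_memory(void)
{
    struct model_spinlock lock;

    memset(&lock, 0, sizeof(lock));
    return model_spin_lock(&lock);
}

/* Properly initialized lock: the check passes. */
static int lock_initialized(void)
{
    struct model_spinlock lock;

    model_spin_lock_init(&lock);
    return model_spin_lock(&lock);
}

/* A stray write clobbers the structure after it was initialized. */
static int lock_overwritten(void)
{
    struct model_spinlock lock;

    model_spin_lock_init(&lock);
    memset(&lock, 0x41, sizeof(lock));
    return model_spin_lock(&lock);
}
```

In the model, the zeroed and overwritten cases both fail the check and only the initialized lock passes, mirroring the uninitialized and corrupted cases described above.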

5. Common Causes of `spinlock bad magic`

Uninitialized lock — using the lock before calling spin_lock_init() or without static init.
Use-after-free — object containing the lock freed while other contexts still access it.
Memory corruption — buffer overflow/underflow or stack corruption overwrote lock bytes.
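Mixing variants — inconsistent use of spin_lock(), spin_lock_irqsave(), spin_lock_bh() on the same lock. This does not overwrite the magic by itself; it more typically shows up as a deadlock or a lockdep warning, so treat it as a related bug to rule out rather than a direct cause.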
Reinitializing while in use — race between init/free and use.
Wrong unlock or wrong pointer — unlocking a wrong object or passing an invalid pointer to unlock.
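Incorrect pairing of irqsave/irqrestore — corrupt or reused flags variable, restoring bad flags. This leaves the CPU with the wrong interrupt state rather than damaging the lock, but it can accompany the bugs above and is worth checking in the same audit.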

6. Relation Between `spin_lock_irqsave()` and `spinlock bad magic`

When CONFIG_DEBUG_SPINLOCK is enabled, spin_lock_irqsave() goes through the same debug acquisition path as the other variants, and that path validates the lock structure before taking the lock. If that validation fails (e.g., magic != expected), you'll see the "bad magic" message. In short:

irqsave doesn't create magic corruption — but it will detect it.
If the lock object is uninitialized or corrupted, any acquisition variant (spin_lock(), irqsave, etc.) will report it.

7. Correct vs Incorrect Usage Examples

Incorrect (causes "bad magic")

struct device_data *dev = kzalloc(sizeof(*dev), GFP_KERNEL);
/* Forgot: spin_lock_init(&dev->lock); */
unsigned long flags;
spin_lock_irqsave(&dev->lock, flags);   // BAD: lock.magic == 0

Correct

struct device_data *dev = kzalloc(sizeof(*dev), GFP_KERNEL);
unsigned long flags;

if (!dev)
        return -ENOMEM;         /* assumes an int-returning caller */
spin_lock_init(&dev->lock);     /* sets the magic before first use */

spin_lock_irqsave(&dev->lock, flags);
/* short, non-sleeping critical section */
spin_unlock_irqrestore(&dev->lock, flags);

If your critical section might sleep, use a mutex (or other sleeping lock) instead of a spinlock.

8. Debugging `spinlock bad magic`

When you see the message, follow these steps:

Read dmesg — the kernel prints the offending lock address and stack trace.
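Map addresses to source: resolve the code addresses in the stack trace with addr2line; the lock address itself resolves through System.map only if the lock is a static (global) variable.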
Confirm initialization: ensure spin_lock_init() or static init was called.
Check object lifetime: ensure the container object wasn't freed while other contexts could still reach it (KASAN reports use-after-free; kmemleak only finds leaks).
Audit usages: verify consistent API usage (irqsave vs plain vs bh) and pairing of save/restore flags.
Enable debugging options: compile kernel with CONFIG_DEBUG_SPINLOCK=y and CONFIG_PROVE_LOCKING=y (which selects CONFIG_LOCKDEP).
Run sanitizers: KASAN catches use-after-free and out-of-bounds writes; kmemleak finds leaked objects, not corruption.
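
The value found in the lock's magic field is itself a useful clue when working through these steps. The helper below is a hypothetical triage sketch, not a kernel API: `classify_magic()` and its result strings are invented for illustration. It assumes the standard slab poison bytes (0x6b for freed objects, 0x5a for allocated but unwritten objects), which only appear when slab poisoning is enabled, and its conclusions are hints rather than proof.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define SPINLOCK_MAGIC_EXPECTED 0xdead4eadu
#define POISON_FREE_WORD        0x6b6b6b6bu  /* slab "freed" poison byte, repeated */
#define POISON_INUSE_WORD       0x5a5a5a5au  /* slab "allocated, unwritten" poison byte */

/* Map an observed magic value to the most likely class of bug (a hint only). */
static const char *classify_magic(uint32_t magic)
{
    if (magic == SPINLOCK_MAGIC_EXPECTED)
        return "ok";
    if (magic == 0)
        return "zeroed: likely never initialized";
    if (magic == POISON_FREE_WORD)
        return "free poison: likely use-after-free";
    if (magic == POISON_INUSE_WORD)
        return "in-use poison: likely never initialized";
    return "other: likely overwritten or wrong pointer";
}
```

For example, a magic of 0 points back to the missing spin_lock_init() case, while 0x6b6b6b6b suggests the containing object was already freed.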

9. Developer Checklist

Always call spin_lock_init() for dynamically allocated locks (or use DEFINE_SPINLOCK() for statics).
Never access a lock after the containing memory has been freed.
Avoid reinitializing locks while they may still be used by other CPUs.
Keep critical sections extremely short and do not sleep while holding spinlocks.
Use irqsave consistently if the lock is shared with interrupts.
Prefer mutexes in purely process context when sleeping is possible.
Compile with debug options and test under heavy concurrency and sanitizers.

10. Diagrams

11. Appendix: Quick reference & links

Quick rules

Variant	Disables	Use when
`spin_lock()`	Nothing extra (preemption only)	Data never touched from IRQ or softirq context
`spin_lock_irqsave()`	Local IRQs (save flags)	Shared with IRQ handlers
`spin_lock_bh()`	Bottom halves / softirqs	Shared with tasklets/softirqs
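
The quick rules can also be written as a small decision helper. This is an illustrative userspace sketch only: `pick_variant()` and `variant_name()` are hypothetical names, and the function encodes just the three rows of the table for code running in process context. Real drivers may have reasons to choose differently.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

enum lock_variant {
    VARIANT_SPIN_LOCK,
    VARIANT_SPIN_LOCK_BH,
    VARIANT_SPIN_LOCK_IRQSAVE,
};

/* Choose the variant for process-context code, strongest requirement first. */
static enum lock_variant pick_variant(bool shared_with_hardirq,
                                      bool shared_with_softirq)
{
    if (shared_with_hardirq)
        return VARIANT_SPIN_LOCK_IRQSAVE;  /* must keep local IRQs out */
    if (shared_with_softirq)
        return VARIANT_SPIN_LOCK_BH;       /* must keep bottom halves out */
    return VARIANT_SPIN_LOCK;              /* nothing asynchronous shares the data */
}

/* Human-readable name of the chosen variant. */
static const char *variant_name(enum lock_variant v)
{
    switch (v) {
    case VARIANT_SPIN_LOCK_IRQSAVE:
        return "spin_lock_irqsave()";
    case VARIANT_SPIN_LOCK_BH:
        return "spin_lock_bh()";
    default:
        return "spin_lock()";
    }
}
```

The hard-IRQ check comes first because disabling local interrupts also covers data that is additionally shared with softirqs.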

Useful kernel debug flags

Enable during development:

CONFIG_DEBUG_SPINLOCK=y
CONFIG_PROVE_LOCKING=y (selects CONFIG_LOCKDEP)
CONFIG_KASAN=y and CONFIG_DEBUG_KMEMLEAK=y for memory errors

RootTrust Labs Tech Journal

Friday, October 31, 2025
