Rust Source Code Analysis: crossbeam's ms_queue (Part 1)
name = "crossbeam"
version = "0.4.1"
crossbeam provides a family of concurrent data structures. This article dissects one of them, the concurrent queue — the classic Michael-Scott queue (msqueue). Since Rust has no garbage collector, crossbeam must itself reclaim heap memory in a concurrent setting; that job mostly falls to crossbeam-epoch. For reasons of length, this article covers the data structures and algorithm of msqueue, and crossbeam-epoch will be covered in follow-up articles.
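Before diving into the source, here is a minimal usage sketch. The module path crossbeam::sync::MsQueue is my assumption for this crate version — check your local docs if it does not resolve:
extern crate crossbeam;
use crossbeam::sync::MsQueue; // path assumed for crossbeam 0.4
use std::sync::Arc;
use std::thread;
fn main() {
    let q = Arc::new(MsQueue::new());
    let producer = {
        let q = Arc::clone(&q);
        thread::spawn(move || {
            for i in 0..4 {
                q.push(i); // lock-free enqueue
            }
        })
    };
    producer.join().unwrap();
    // try_pop is non-blocking: Some(v) while data remains, then None.
    while let Some(v) = q.try_pop() {
        println!("{}", v);
    }
}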
MsQueue<T>
Let's start with the constructor:
#[derive(Debug)]
pub struct MsQueue<T> {
head: CachePadded<Atomic<Node<T>>>,
tail: CachePadded<Atomic<Node<T>>>,
}
impl<T> MsQueue<T> {
/// Create a new, empty queue.
pub fn new() -> MsQueue<T> {
let q = MsQueue {
head: CachePadded::new(Atomic::null()),
tail: CachePadded::new(Atomic::null()),
};
let sentinel = Owned::new(Node {
payload: Payload::Data(unsafe { mem::uninitialized() }),
next: Atomic::null(),
});
let guard = epoch::pin();
let sentinel = sentinel.into_shared(&guard);
q.head.store(sentinel, Relaxed);
q.tail.store(sentinel, Relaxed);
q
}
As you can see, an MsQueue is a pair {head, tail}; both head and tail are wrapped in CachePadded, and each CachePadded wraps an Atomic.
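A quick aside on what CachePadded buys. The following is an illustrative sketch, not crossbeam's actual type (a 64-byte cache line is assumed; the real type picks alignment per target): by padding head and tail onto separate cache lines, producers hammering tail never invalidate the line consumers read head from, i.e. false sharing is avoided.
#[repr(align(64))] // 64-byte cache line assumed
#[allow(dead_code)]
struct CachePaddedLike<T> {
    value: T,
}
fn main() {
    assert_eq!(std::mem::align_of::<CachePaddedLike<u8>>(), 64);
    // With head and tail each padded, an update to one never evicts the other's line.
}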
Next a sentinel node is constructed as an Owned. The Owned wraps a Node; a Node carries a payload (the data we store) and a next field pointing to the following node — itself an Atomic.
The line "let guard = epoch::pin()" supports memory reclamation; we will not dwell on it for now.
Next comes:
let sentinel = sentinel.into_shared(&guard);
A sentinel is exactly that — a dummy node used only to initialize the queue. Let's see what this line actually does.
Owned<T> has an into_shared method:
impl<T> Owned<T> {
.....
pub fn into_shared<'g>(self, _: &'g Guard) -> Shared<'g, T> {
unsafe { Shared::from_usize(self.into_usize()) }
}
}
In short, it produces a Shared<'g, T> from an Owned<T> — as the names suggest, turning exclusively owned data into shared data. Let's look at Owned::into_usize and the structure of Owned itself:
/// A trait for either `Owned` or `Shared` pointers.
pub trait Pointer<T> {
/// Returns the machine representation of the pointer.
fn into_usize(self) -> usize;
/// Returns a new pointer pointing to the tagged pointer `data`.
unsafe fn from_usize(data: usize) -> Self;
}
pub struct Owned<T> {
data: usize,
_marker: PhantomData<Box<T>>,
}
impl<T> Pointer<T> for Owned<T> {
#[inline]
fn into_usize(self) -> usize {
let data = self.data;
mem::forget(self);
data
}
.......
}
So Owned<T> is really just a data: usize (usable as an address) plus a _marker (a PhantomData), and into_usize simply returns data.
The mem::forget(self) here exists to prevent self's drop from running. The reason is that reclaiming the Node's heap memory is the job of Owned's drop implementation; the details will be covered in the crossbeam-epoch analysis.
Now let's look at Shared::from_usize and the structure of Shared itself:
pub struct Shared<'g, T: 'g> {
data: usize,
_marker: PhantomData<(&'g (), *const T)>,
}
impl<'g, T> Pointer<T> for Shared<'g, T> {
#[inline]
fn into_usize(self) -> usize {
self.data
}
#[inline]
unsafe fn from_usize(data: usize) -> Self {
Shared {
data: data,
_marker: PhantomData,
}
}
}
So Shared is nearly identical to Owned, except that Shared carries an extra lifetime parameter 'g.
And from_usize merely constructs a Shared from a usize.
Putting it together, into_shared just hands the data from an Owned over to a newly constructed Shared; the two structs are identical apart from that lifetime parameter.
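To make that hand-off concrete, here is a simplified model of the round trip. This is a sketch under my own simplifications, not crossbeam's code — in particular, real crossbeam also packs tag bits into the low bits of data, omitted here:
use std::marker::PhantomData;
use std::mem;
// Simplified stand-ins for Owned/Shared: both are just a usize holding an address.
struct OwnedLike<T> {
    data: usize,
    _marker: PhantomData<Box<T>>,
}
struct SharedLike<'g, T: 'g> {
    data: usize,
    _marker: PhantomData<(&'g (), *const T)>,
}
impl<T> OwnedLike<T> {
    fn new(t: T) -> Self {
        // Box::into_raw hands us the heap address as a raw pointer.
        OwnedLike { data: Box::into_raw(Box::new(t)) as usize, _marker: PhantomData }
    }
    fn into_shared<'g>(self) -> SharedLike<'g, T> {
        let data = self.data;
        mem::forget(self); // suppress Drop, exactly as Owned::into_usize does
        SharedLike { data: data, _marker: PhantomData }
    }
}
impl<T> Drop for OwnedLike<T> {
    fn drop(&mut self) {
        // Owned's Drop is what frees the node; into_shared skips it on purpose.
        unsafe { drop(Box::from_raw(self.data as *mut T)) }
    }
}
fn main() {
    let owned = OwnedLike::new(42u32);
    let shared = owned.into_shared();
    // shared.data still points at the live heap node.
    unsafe { assert_eq!(*(shared.data as *const u32), 42) };
    // Real crossbeam defers reclamation via epoch GC; this sketch just leaks the node.
}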
Back to the remaining constructor code:
let sentinel = sentinel.into_shared(&guard);
q.head.store(sentinel, Relaxed);
q.tail.store(sentinel, Relaxed);
q
The intent is obvious: point both head and tail at the sentinel. Let's see what store actually does:
pub fn store<'g, P: Pointer<T>>(&self, new: P, ord: Ordering) {
self.data.store(new.into_usize(), ord);
}
Atomic is defined as:
pub struct Atomic<T> {
data: AtomicUsize,
_marker: PhantomData<*mut T>,
}
So in substance, store takes the usize carried by an Owned or a Shared (note that both implement Pointer) — which is really the Node's address — and places it into its own AtomicUsize. That is what gives head and tail their capacity for atomic operations; underneath, it is still just that address.
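As a tiny standalone demonstration of that idea (not crossbeam code): an AtomicUsize is perfectly capable of carrying a heap address, which is all that Atomic<T>'s data field does.
use std::sync::atomic::{AtomicUsize, Ordering};
fn main() {
    // Stash a heap address in an AtomicUsize — the same trick Atomic<T> uses.
    let slot = AtomicUsize::new(0);
    let addr = Box::into_raw(Box::new(42u32)) as usize;
    slot.store(addr, Ordering::Relaxed);
    let p = slot.load(Ordering::Relaxed) as *mut u32;
    unsafe {
        assert_eq!(*p, 42);
        drop(Box::from_raw(p)); // reclaim the heap allocation
    }
}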
With that settled, let's look at the operations MsQueue provides: pub fn push(&self, t: T) and pub fn pop(&self) -> T.
pub fn push(&self, t: T)
/// Add `t` to the back of the queue, possibly waking up threads
/// blocked on `pop`.
pub fn push(&self, t: T) {
/// We may or may not need to allocate a node; once we do,
/// we cache that allocation.
enum Cache<T> {
Data(T),
Node(Owned<Node<T>>),
}
impl<T> Cache<T> {
/// Extract the node if cached, or allocate if not.
fn into_node(self) -> Owned<Node<T>> {
match self {
Cache::Data(t) => Owned::new(Node {
payload: Payload::Data(ManuallyDrop::new(t)),
next: Atomic::null(),
}),
Cache::Node(n) => n,
}
}
/// Extract the data from the cache, deallocating any cached node.
fn into_data(self) -> T {
match self {
Cache::Data(t) => t,
Cache::Node(node) => match (*node.into_box()).payload {
Payload::Data(t) => ManuallyDrop::into_inner(t),
_ => unreachable!(),
},
}
}
}
let mut cache = Cache::Data(t); // don't allocate up front
let guard = epoch::pin();
loop {
// We push onto the tail, so we'll start optimistically by looking
// there first.
let tail_shared = self.tail.load(Acquire, &guard);
let tail_ref = unsafe { tail_shared.as_ref() }.unwrap();
// Is the queue in Data mode (empty queues can be viewed as either mode)?
if tail_ref.is_data() || self.head.load(Relaxed, &guard) == tail_shared {
// Attempt to push onto the `tail` snapshot; fails if
// `tail.next` has changed, which will always be the case if the
// queue has transitioned to blocking mode.
match self.push_internal(&guard, tail_shared, cache.into_node()) {
Ok(_) => return,
Err(n) => {
// replace the cache, retry whole thing
cache = Cache::Node(n)
}
}
} else {
// Queue is in blocking mode. Attempt to unblock a thread.
let head_shared = self.head.load(Acquire, &guard);
let head = unsafe { head_shared.as_ref() }.unwrap();
// Get a handle on the first blocked node. Racy, so queue might
// be empty or in data mode by the time we see it.
let next_shared = head.next.load(Acquire, &guard);
let request = unsafe { next_shared.as_ref() }.and_then(|next| match next.payload {
Payload::Blocked(signal) => Some((next_shared, signal)),
Payload::Data(_) => None,
});
if let Some((blocked_node, signal)) = request {
// race to dequeue the node
if self.head
.compare_and_set(head_shared, blocked_node, Release, &guard)
.is_ok()
{
unsafe {
// signal the thread
(*signal).data = Some(cache.into_data());
let thread = (*signal).thread.clone();
(*signal).ready.store(true, Release);
thread.unpark();
guard.defer(move || head_shared.into_owned());
return;
}
}
}
}
}
}
First, there is a small helper type, Cache:
enum Cache<T> {
Data(T),
Node(Owned<Node<T>>),
}
In this usage its main point is that once a Cache::Node has been created, the Owned<Node<T>> inside is kept and reused, saving repeated allocations.
Cache then provides two methods: into_node and into_data.
impl<T> Cache<T> {
/// Extract the node if cached, or allocate if not.
fn into_node(self) -> Owned<Node<T>> {
match self {
Cache::Data(t) => Owned::new(Node {
payload: Payload::Data(ManuallyDrop::new(t)),
next: Atomic::null(),
}),
Cache::Node(n) => n,
}
}
/// Extract the data from the cache, deallocating any cached node.
fn into_data(self) -> T {
match self {
Cache::Data(t) => t,
Cache::Node(node) => match (*node.into_box()).payload {
Payload::Data(t) => ManuallyDrop::into_inner(t),
_ => unreachable!(),
},
}
}
}
into_node shows that if we already hold a Cache::Node, no new Owned needs to be constructed.
into_data covers the opposite case: a Cache has already been built, and we need to extract the T back out of it.
Now for the actual logic:
let mut cache = Cache::Data(t); // don't allocate up front
let guard = epoch::pin();
First, the value t is wrapped in a Cache::Data.
As before, "let guard = epoch::pin();" relates to memory reclamation and is set aside for now.
Then comes the main loop:
loop {
// We push onto the tail, so we'll start optimistically by looking
// there first.
let tail_shared = self.tail.load(Acquire, &guard);
let tail_ref = unsafe { tail_shared.as_ref() }.unwrap();
// Is the queue in Data mode (empty queues can be viewed as either mode)?
if tail_ref.is_data() || self.head.load(Relaxed, &guard) == tail_shared {
// Attempt to push onto the `tail` snapshot; fails if
// `tail.next` has changed, which will always be the case if the
// queue has transitioned to blocking mode.
match self.push_internal(&guard, tail_shared, cache.into_node()) {
Ok(_) => return,
Err(n) => {
// replace the cache, retry whole thing
cache = Cache::Node(n)
}
}
} else {
// Queue is in blocking mode. Attempt to unblock a thread.
let head_shared = self.head.load(Acquire, &guard);
let head = unsafe { head_shared.as_ref() }.unwrap();
// Get a handle on the first blocked node. Racy, so queue might
// be empty or in data mode by the time we see it.
let next_shared = head.next.load(Acquire, &guard);
let request = unsafe { next_shared.as_ref() }.and_then(|next| match next.payload {
Payload::Blocked(signal) => Some((next_shared, signal)),
Payload::Data(_) => None,
});
if let Some((blocked_node, signal)) = request {
// race to dequeue the node
if self.head
.compare_and_set(head_shared, blocked_node, Release, &guard)
.is_ok()
{
unsafe {
// signal the thread
(*signal).data = Some(cache.into_data());
let thread = (*signal).thread.clone();
(*signal).ready.store(true, Release);
thread.unpark();
guard.defer(move || head_shared.into_owned());
return;
}
}
}
}
}
From an algorithmic standpoint: this is a non-blocking MPMC queue, and given the nondeterminism of concurrent execution, an operation only completes when a single atomic step (a compare-and-swap here) commits it. If that step fails, we reload the latest state and try again — hence the outer loop.
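Here is the same optimistic pattern in miniature (a standalone sketch, unrelated to the queue itself): read the current value, compute the update, attempt a CAS, and loop on failure.
use std::sync::atomic::{AtomicUsize, Ordering};
fn increment(counter: &AtomicUsize) {
    loop {
        let cur = counter.load(Ordering::Relaxed);
        // Another thread may race ahead of us; only the CAS decides who wins.
        if counter
            .compare_exchange(cur, cur + 1, Ordering::Relaxed, Ordering::Relaxed)
            .is_ok()
        {
            return;
        }
        // CAS failed: reload the latest value and try again.
    }
}
fn main() {
    let c = AtomicUsize::new(0);
    increment(&c);
    assert_eq!(c.load(Ordering::Relaxed), 1);
}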
One more thing worth spelling out: the author's queue is not a plain concurrent queue in which every call returns immediately; it deliberately builds in a user-space scheduling interface via park/unpark. For push this changes nothing — a push always just inserts a value — but pop comes in two flavors:
/// Attempt to dequeue from the front.
///
/// Returns `None` if the queue is observed to be empty.
pub fn try_pop(&self) -> Option<T> {
/// Dequeue an element from the front of the queue, blocking if the queue is
/// empty.
pub fn pop(&self) -> T {
As the doc comments say:
- try_pop(&self) -> Option<T>: returns the data if there is any, or None if the queue is empty.
- pop(&self) -> T: returns the data if there is any, otherwise waits until something is pushed (hence the plain return type T) — see the sketch below.
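A small sketch of the two flavors side by side (same assumed module path as before):
extern crate crossbeam;
use crossbeam::sync::MsQueue; // path assumed, as above
use std::sync::Arc;
use std::thread;
fn main() {
    let q = Arc::new(MsQueue::new());
    assert_eq!(q.try_pop(), None); // non-blocking: empty queue gives None
    let q2 = Arc::clone(&q);
    // pop parks the calling thread until a producer supplies a value.
    let waiter = thread::spawn(move || q2.pop());
    q.push(7);
    assert_eq!(waiter.join().unwrap(), 7);
}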
So how is this actually implemented?
In short, a pop call that is about to block first enqueues a Node of its own. Unlike the data-carrying Nodes described so far, it contains the popping thread's handle (thread), a flag saying whether this call may return data yet (ready), and a slot for the returned data itself (data). Node accordingly provides an is_data method to tell the two kinds apart. Here is the relevant code:
#[derive(Debug)]
struct Node<T> {
payload: Payload<T>,
next: Atomic<Node<T>>,
}
#[derive(Debug)]
enum Payload<T> {
/// A node with actual data that can be popped.
Data(ManuallyDrop<T>),
/// A node representing a blocked request for data.
Blocked(*mut Signal<T>),
}
/// A blocked request for data, which includes a slot to write the data.
#[derive(Debug)]
struct Signal<T> {
/// Thread to unpark when data is ready.
thread: Thread,
/// The actual data, when available.
data: Option<T>,
/// Is the data ready? Needed to cope with spurious wakeups.
ready: AtomicBool,
}
impl<T> Node<T> {
fn is_data(&self) -> bool {
if let Payload::Data(_) = self.payload {
true
} else {
false
}
}
}
Payload::Data is used by push; Payload::Blocked is used by a waiting pop.
Signal is the vehicle through which the data and the wakeup are handed over.
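The park/unpark handshake that Signal enables can be sketched in isolation (illustrative only; in the real code the pushing thread writes data and sets ready, as we will see). The loop around park is precisely what the ready flag is for — park may return spuriously:
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;
use std::thread;
fn main() {
    let ready = Arc::new(AtomicBool::new(false));
    let r = Arc::clone(&ready);
    let consumer = thread::spawn(move || {
        // Spurious wakeups are allowed, so the flag — not park itself —
        // decides when we are done waiting.
        while !r.load(Ordering::Acquire) {
            thread::park();
        }
    });
    ready.store(true, Ordering::Release); // publish "data is ready"
    consumer.thread().unpark();           // then wake the waiter
    consumer.join().unwrap();
}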
OK — after all of that, the takeaway for push is that it must distinguish two situations: the queue holds data, or the queue holds waiting pop "calls". The distinction is made through the tail node.
Back to the code:
// We push onto the tail, so we'll start optimistically by looking
// there first.
let tail_shared = self.tail.load(Acquire, &guard);
let tail_ref = unsafe { tail_shared.as_ref() }.unwrap();
First, tail (an Atomic) is loaded to obtain a Shared (on the very first call this will be the sentinel). Then the Shared goes through as_ref() and unwrap():
pub fn as_raw(&self) -> *const T {
let (raw, _) = decompose_data::<T>(self.data);
raw
}
pub unsafe fn as_ref(&self) -> Option<&'g T> {
self.as_raw().as_ref()
}
...which yields our reference to the T — here, the Node — as tail_ref.
Next we distinguish whether the queue holds data (an empty queue counts as either mode) or already has waiting pop calls:
// Is the queue in Data mode (empty queues can be viewed as either mode)?
if tail_ref.is_data() || self.head.load(Relaxed, &guard) == tail_shared {
..............
} else {
..............
}
Data Mode
Let's take the data-mode branch first:
// Attempt to push onto the `tail` snapshot; fails if
// `tail.next` has changed, which will always be the case if the
// queue has transitioned to blocking mode.
match self.push_internal(&guard, tail_shared, cache.into_node()) {
Ok(_) => return,
Err(n) => {
// replace the cache, retry whole thing
cache = Cache::Node(n)
}
}
Note that this is where into_node is finally called — the only call site, as it turns out. On Ok(_) the return value is ignored and we are done. On Err(n) we rebuild the cache and retry the whole loop; n is, of course, the previously constructed Owned<Node<T>> handed back to us.
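This Ok/Err(n) shape is a pattern worth noticing: a failed operation returns ownership of the allocation so the caller can retry without reallocating. A generic sketch of the pattern (my own illustration, not crossbeam code):
// try_push returns Err(value) on a lost race, handing the value back intact.
fn retry_until_ok<T, F>(mut value: T, mut try_push: F)
where
    F: FnMut(T) -> Result<(), T>,
{
    loop {
        match try_push(value) {
            Ok(()) => return,
            // The same value (or cached node) is reused on the next attempt.
            Err(v) => value = v,
        }
    }
}
fn main() {
    let mut attempts = 0;
    retry_until_ok(7, |v| {
        attempts += 1;
        if attempts < 3 { Err(v) } else { Ok(()) }
    });
    assert_eq!(attempts, 3);
}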
On to push_internal:
/// Attempt to atomically place `n` into the `next` pointer of `onto`.
///
/// If unsuccessful, returns ownership of `n`, possibly updating
/// the queue's `tail` pointer.
fn push_internal(
&self,
guard: &epoch::Guard,
onto: Shared<Node<T>>,
n: Owned<Node<T>>,
) -> Result<(), Owned<Node<T>>> {
// is `onto` the actual tail?
let next_atomic = &unsafe { onto.as_ref() }.unwrap().next;
let next_shared = next_atomic.load(Acquire, guard);
if unsafe { next_shared.as_ref() }.is_some() {
// if not, try to "help" by moving the tail pointer forward
let _ = self.tail.compare_and_set(onto, next_shared, Release, guard);
Err(n)
} else {
// looks like the actual tail; attempt to link in `n`
next_atomic
.compare_and_set(Shared::null(), n, Release, guard)
.map(|shared| {
// try to move the tail pointer forward
let _ = self.tail.compare_and_set(onto, shared, Release, guard);
})
.map_err(|e| e.new)
}
}
Here onto is our snapshot of the tail node, and n is the Owned<Node<T>> we built.
First:
let next_atomic = &unsafe { onto.as_ref() }.unwrap().next;
This gives us next_atomic directly — the next field (an Atomic) of the tail node. We then load it to obtain the Shared inside. Note that if this next_atomic was created by Atomic::null(), no Node exists behind it yet, but the load still yields a valid (null) Shared:
let next_shared = next_atomic.load(Acquire, guard);
The next line is the crux:
if unsafe { next_shared.as_ref() }.is_some() {
...............
} else {
...............
}
Calling as_ref on next_shared returns an Option wrapping a reference to the Node, so is_some() makes the distinction. If it is Some, another thread has already linked in a next node, in which case all we do is help advance the tail pointer and retry; only on None can we proceed. Here is the code:
if unsafe { next_shared.as_ref() }.is_some() {
// if not, try to "help" by moving the tail pointer forward
let _ = self.tail.compare_and_set(onto, next_shared, Release, guard);
Err(n)
} else {
// looks like the actual tail; attempt to link in `n`
next_atomic
.compare_and_set(Shared::null(), n, Release, guard)
.map(|shared| {
// try to move the tail pointer forward
let _ = self.tail.compare_and_set(onto, shared, Release, guard);
})
.map_err(|e| e.new)
}
The key here is the Atomic::compare_and_set method. Note also the two result-handling paths:
- map(...): swing the tail from the old snapshot onto to the new tail shared. Since shared has already been linked in as onto's next, this advance is certain to happen eventually (by us or by a helping thread), so the CAS result can be ignored.
- map_err(...): the compare_and_set failed; via "|e| e.new" we recover the previously constructed Owned<Node<T>> and return it to the caller.
Now for Atomic::compare_and_set itself:
pub fn compare_and_set<'g, O, P>(
&self,
current: Shared<T>,
new: P,
ord: O,
_: &'g Guard,
) -> Result<Shared<'g, T>, CompareAndSetError<'g, T, P>>
where
O: CompareAndSetOrdering,
P: Pointer<T>,
{
let new = new.into_usize();
self.data
.compare_exchange(current.into_usize(), new, ord.success(), ord.failure())
.map(|_| unsafe { Shared::from_usize(new) })
.map_err(|current| unsafe {
CompareAndSetError {
current: Shared::from_usize(current),
new: P::from_usize(new),
}
})
}
Note that current here is Shared::null(), serving as our expectation, and new is our Owned<Node<T>>. The parameter is generic over P: Pointer<T>: since both Owned and Shared implement the Pointer trait, either kind of pointer can be passed:
/// A trait for either `Owned` or `Shared` pointers.
pub trait Pointer<T> {
/// Returns the machine representation of the pointer.
fn into_usize(self) -> usize;
/// Returns a new pointer pointing to the tagged pointer `data`.
unsafe fn from_usize(data: usize) -> Self;
}
First we see:
let new = new.into_usize();
Here new becomes a usize — that is, the Node's address.
Then:
self.data
.compare_exchange(current.into_usize(), new, ord.success(), ord.failure())
.map(|_| unsafe { Shared::from_usize(new) })
.map_err(|current| unsafe {
CompareAndSetError {
current: Shared::from_usize(current),
new: P::from_usize(new),
}
})
self.data is the standard library's AtomicUsize, and compare_exchange performs the atomic compare-and-swap. Both current.into_usize() and new are plain usize values being compared, with the atomicity guaranteed by a hardware instruction underneath.
One more note: look at the return type of AtomicUsize::compare_exchange:
pub fn compare_exchange(
&self,
current: usize,
new: usize,
success: Ordering,
failure: Ordering
) -> Result<usize, usize>
If the Result is Ok(value), the operation succeeded and value == current, i.e. the expected prior value. If it is Err(value), the operation failed and value is the latest value actually observed.
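A quick demonstration of that Result convention:
use std::sync::atomic::{AtomicUsize, Ordering};
fn main() {
    let a = AtomicUsize::new(5);
    // Expectation matches: Ok carries the previous value (our `current`).
    assert_eq!(a.compare_exchange(5, 10, Ordering::SeqCst, Ordering::SeqCst), Ok(5));
    // Expectation is stale: Err carries the value actually observed.
    assert_eq!(a.compare_exchange(5, 99, Ordering::SeqCst, Ordering::SeqCst), Err(10));
}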
To be continued.