# Troubleshooting

To get an idea of the inner workings of Concurrently, have a look at the Flow of control section in the Overview.

## An evaluation is scheduled but never run

Consider the following script:

```ruby
#!/usr/bin/env ruby

concurrently do
  puts "I will be forgotten, like tears in the rain."
end

puts "Unicorns!"
```

Running it will only print:

```
Unicorns!
```

`concurrently{}` is a shortcut for `concurrent_proc{}.call_detached`, which does not evaluate its code right away but schedules it to run during the next iteration of the event loop. But since the main evaluation never awaits anything, the event loop is never entered and the concurrent evaluation is never started.
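The same phenomenon can be reproduced with plain Ruby fibers, which Concurrently builds upon: creating a fiber merely registers its block, and nothing runs until somebody resumes it. A minimal, gem-free sketch:

```ruby
log = []

# Creating a fiber is analogous to calling concurrently{}: the block is
# only registered, not executed.
fiber = Fiber.new { log << "I will be forgotten, like tears in the rain." }

log << "Unicorns!"

# fiber.resume is never called -- the stand-in for never entering the
# event loop -- so the fiber's block never runs.
log  # => ["Unicorns!"]
```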

A more subtle variation of this behavior occurs in the following scenario:

```ruby
#!/usr/bin/env ruby

concurrently do
  puts "Unicorns!"
  wait 2
  puts "I will be forgotten, like tears in the rain."
end

wait 1
```

Running it will also only print:

```
Unicorns!
```

This time, the main evaluation does await something: the end of a one-second time frame. Because of this, the evaluation of the `concurrently` block is indeed started and immediately begins to wait for two seconds. After one second the main evaluation is resumed and exits. The `concurrently` block is never woken again from its now eternal beauty sleep.

## A call is blocking the entire execution

```ruby
#!/usr/bin/env ruby

r,w = IO.pipe

concurrently do
  w.write 'Wake up!'
end

r.readpartial 32
```

Here, although we are effectively waiting for `r` to become readable, we do so in a blocking manner: `IO#readpartial` blocks. This brings the whole process to a halt. The event loop is never entered, the `concurrently` block is never run, nothing is ever written to the pipe, and we end up with a nice deadlock.

You can use blocking calls to deal with I/O, but you should await the IO's readiness beforehand. If instead of just `r.readpartial 32` we write:

```ruby
r.await_readable
r.readpartial 32
```

we suspend the main evaluation and switch to the event loop, which runs the `concurrently` block. Once there is something to read from `r`, the main evaluation is resumed.

This approach is not perfect: it is unnecessarily slow if `r` is already readable and we could read from it immediately. But it is still better than blocking everything by default.

The most efficient way is to do a non-blocking read and only await readability if there is nothing to read yet:

```ruby
begin
  r.read_nonblock 32
rescue IO::WaitReadable
  r.await_readable
  retry
end
```
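The same pattern works in plain Ruby with the standard library's `io/wait`: `IO#wait_readable` (a blocking stand-in for Concurrently's `r.await_readable`) fills the role of awaiting readiness. A self-contained sketch:

```ruby
require 'io/wait'

r, w = IO.pipe
w.write 'Wake up!'

data =
  begin
    # Succeeds immediately here because the pipe already contains data.
    r.read_nonblock(32)
  rescue IO::WaitReadable
    # Nothing to read yet: wait until r becomes readable, then retry.
    r.wait_readable
    retry
  end

data  # => "Wake up!"
```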

## The event loop is jammed by too many or too expensive evaluations

Let's talk about a concurrent evaluation with an infinite loop:

```ruby
evaluation = concurrently do
  loop do
    puts "To infinity! And beyond!"
  end
end

concurrently do
  evaluation.conclude_to :cancelled
end
```

Once the loop evaluation is scheduled to run, it runs and runs and runs and never finishes. The event loop is never entered again, and the second evaluation, which is supposed to conclude the first one, is never started.
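The underlying rule is cooperative scheduling: an evaluation must yield control, or nothing else runs. This can be sketched with plain fibers and a toy round-robin loop (a stand-in for Concurrently's event loop, not its actual implementation):

```ruby
output = []

# Each fiber yields control after every step. A loop without Fiber.yield
# would monopolize the thread, just like the infinite loop above.
f1 = Fiber.new { 3.times { |i| output << "f1 step #{i}"; Fiber.yield } }
f2 = Fiber.new { 3.times { |i| output << "f2 step #{i}"; Fiber.yield } }

# Toy round-robin scheduler.
while f1.alive? || f2.alive?
  [f1, f2].each { |f| f.resume if f.alive? }
end

output  # => ["f1 step 0", "f2 step 0", "f1 step 1", "f2 step 1", "f1 step 2", "f2 step 2"]
```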

A less extreme example is something like:

```ruby
concurrently do
  loop do
    wait 0.1
    puts "timer triggered at: #{Time.now.strftime('%H:%M:%S.%L')}"
    concurrently do
      sleep 1 # blocks the entire event loop
    end
  end
end.await_result

# => timer triggered at: 16:08:17.704
# => timer triggered at: 16:08:18.705
# => timer triggered at: 16:08:19.705
# => timer triggered at: 16:08:20.705
# => timer triggered at: 16:08:21.706
```

This timer is supposed to fire every 0.1 seconds, and on each tick it creates another evaluation that blocks for a full second. Because that evaluation takes so long, the loop only gets a chance to run once per second, delaying the timer by 0.9 seconds between the time it is supposed to run and the time it actually runs.

## Forking the process causes issues

A fork inherits the main thread, and with it the event loop with all its internal state, from the parent. This is a problem since fibers created in the parent process cannot be resumed in the forked process. Trying to do so raises a "fiber called across stack rewinding barrier" error. Also, we probably do not want to continue watching the parent's IOs.

To fix this, the event loop has to be reinitialized directly after forking:

```ruby
fork do
  Concurrently::EventLoop.current.reinitialize!
  # ...
end

# ...
```

While reinitializing the event loop clears its list of IOs watched for readiness, the IOs themselves are left untouched. You are responsible for managing IOs (e.g. closing them).
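A minimal plain-Ruby sketch of what managing inherited IOs after a fork can look like (the `reinitialize!` call is only relevant when Concurrently is loaded and is shown as a comment): each process closes the pipe ends it does not use.

```ruby
r, w = IO.pipe

pid = fork do
  # In a Concurrently application, reinitialize the event loop first:
  # Concurrently::EventLoop.current.reinitialize!
  r.close            # the child only writes
  w.write "from child"
  w.close
end

w.close              # the parent only reads
message = r.read     # reads until the child closes its write end
r.close
Process.wait(pid)

message  # => "from child"
```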

## Errors tear down the event loop

Every evaluation rescues the following errors: `NoMemoryError`, `ScriptError`, `SecurityError`, `StandardError` and `SystemStackError`. These are all errors that should not have an immediate influence on other evaluations or the application as a whole. They are rescued and do not leak to the event loop, so they will not tear it down.

All other errors happening during an evaluation will tear down the event loop. These error types are: `SignalException`, `SystemExit` and the general `Exception`. In such a case the event loop exits by re-raising the causing error.
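The distinction can be observed with plain fibers: an error rescued inside a fiber stays there, while one that escapes surfaces in the code resuming the fiber, which here stands in for the event loop. A gem-free sketch:

```ruby
# A StandardError rescued inside the fiber never reaches the resumer.
contained = Fiber.new do
  begin
    raise ArgumentError, "boom"
  rescue StandardError => e
    "rescued inside: #{e.class}"
  end
end.resume

# An error escaping the fiber propagates out of #resume -- the analogue
# of an unhandled error tearing down the event loop.
escaped =
  begin
    Fiber.new { raise Interrupt }.resume
    "not reached"
  rescue Interrupt
    "Interrupt escaped to the resumer"
  end

contained  # => "rescued inside: ArgumentError"
escaped    # => "Interrupt escaped to the resumer"
```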

If your application rescues the error while the event loop is torn down and continues running (IRB does this, for example), it will do so with a reinitialized event loop.

## Using Plain Fibers

In principle, you can safely use plain Ruby fibers alongside concurrent procs. Just make sure you operate exclusively on these fibers so you do not accidentally interfere with the fibers managed by Concurrently. Be especially careful with `Fiber.yield` and `Fiber.current` inside a concurrent evaluation.

## Fiber-local variables are treated as thread-local

In Ruby, `Thread#[]`, `#[]=`, `#key?` and `#keys` operate on variables local to the current fiber, not the current thread. This behavior mostly goes unnoticed because people rarely work with fibers explicitly. In that case, each thread has exactly one fiber, and thread-local and fiber-local variables behave the same way.
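This fiber-locality is easy to demonstrate in plain Ruby, without Concurrently loaded: a value set via `Thread#[]` vanishes in a new fiber on the same thread, while a real thread variable remains visible.

```ruby
Thread.current[:greeting] = "only in the root fiber"
Thread.current.thread_variable_set(:motd, "visible across fibers")

in_new_fiber = Fiber.new do
  # Thread#[] reads fiber-local storage, which is empty in this fresh fiber;
  # thread_variable_get reads true thread-local storage.
  [Thread.current[:greeting], Thread.current.thread_variable_get(:motd)]
end.resume

in_new_fiber  # => [nil, "visible across fibers"]
```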

But once fibers come into play and a single thread starts switching between them, these methods cause errors instantly. Since Concurrently is built upon fibers, it needs to work around these issues. Most of the time the real intention is to set variables local to the current thread, just like the receiver of these methods suggests. For this reason, `Thread#[]`, `#[]=`, `#key?` and `#keys` are boldly redirected to `Thread#thread_variable_get`, `#thread_variable_set`, `#thread_variable?` and `#thread_variables`.

If you are one of those actually using fibers with variables intended to be fiber-local, you have two options: 1) don't use Concurrently, or 2) turn these fibers into concurrent procs and store the variables in their evaluation's data store.

```ruby
fiber = Fiber.new do
  Thread.current[:key] = "I intend to be fiber-local!"
  puts Thread.current[:key]
end

fiber.resume
```

becomes:

```ruby
conproc = concurrent_proc do
  Concurrently::Evaluation.current[:key] = "I'm evaluation-local!"
  puts Concurrently::Evaluation.current[:key]
end

conproc.call
```

## FiberError: mprotect failed

Each concurrent evaluation runs in a fiber. If your application creates more concurrent evaluations than are concluded, more and more fibers need to be created. At some point the creation of additional fibers fails with "FiberError: mprotect failed". This is caused by hitting the limit on the number of distinct memory maps a process can have. The corresponding Linux kernel parameter is `/proc/sys/vm/max_map_count` and defaults to around 64k (65530). Each fiber creates two memory maps, leading to a default maximum of around 30k fibers. To create more fibers, `max_map_count` needs to be increased, e.g.:

```
$ sysctl -w vm.max_map_count=262144
```

See also: https://stackoverflow.com/a/11685165/3323185