Solving problems in computational biology using the Micron Automata Processor
ABSTRACT
Micron Automata Processor is an exciting new re-configurable streaming co-processor architecture which can be used to do parallel pattern-matching on large data streams using direct implementation of Non-deterministic Finite Automata (NFA). Information about this technology was publicly released in November 2013 and the chip production is set to start in early 2014. We have been working on this technology through ea rly resources provided by Micron. Our investigations in trying to use this co-processor to solve complex problems in computational biology revealed that this processor can be used to greatly accelerate some generic graph processing algorithms.
In this talk, we will be discussing this new co-processor architecture and its use in finding approximately conserved sequences, called motifs, across multiple DNA or protein sequences, which is an important problem in computational biology. We consider the (l, d) motif search problem of identifying one or more motifs of length l present in at least q of the n given sequences, with e ach occurrence differing from the motif in at most d substitutions. The problem is known to be NP-hard, and the largest solved instance reported to date is (26, 11). We propose a novel algorithm for the (l, d) motif search problem using streaming execution over a large set of Non-deterministic Finite Automata (NFA). We estimate the run-time for the (39, 18) and (40, 17) problem using the resources available within a single Automata Processor board.
Index Terms—computational biology; finite automaton; graph algorithms; hardware acceleration; motif detection.