Here we handle the bookmark events raised by the speech engine when it encounters a bookmark tag
in the speech stream. We interpret the tag's argument as the name of an animation stored
in our animation hash file and use that name as the key to retrieve the animation data, which is
an arraylist of byte arrays, each acting as one animation time "slice".
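
The lookup described above can be sketched roughly as follows. This is a minimal illustration in Python, not the original handler: the names `on_bookmark` and `animations` are hypothetical, the hash file is stood in for by an in-memory dictionary, and the speech engine's actual event signature is not shown.

```python
from typing import Dict, List, Optional

# Hypothetical stand-in for the animation hash file:
# animation name -> list of byte arrays, one per animation time slice.
AnimationTable = Dict[str, List[bytes]]


def on_bookmark(bookmark_name: str, animations: AnimationTable) -> Optional[List[bytes]]:
    """Treat the bookmark's argument as an animation name and look it up.

    Returns the list of time slices, or None if no animation has that name.
    """
    return animations.get(bookmark_name)


# Example data (made up for illustration): one animation with two slices.
animations: AnimationTable = {
    "wave": [bytes([0x01, 0x02]), bytes([0x03, 0x04])],
}

slices = on_bookmark("wave", animations)
if slices is not None:
    for index, time_slice in enumerate(slices):
        # A real handler would hand each slice to the animation playback code.
        print(index, time_slice.hex())
```

Returning None for an unknown name keeps a mistyped bookmark from interrupting speech output; whether the original code ignores or reports such cases is not stated here.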