You need a UFO mesh and ordinary rendering software. The next thing you need is ordinary bitmap painting software.
You create the scene in the rendering software (for example POV-Ray) and render the mesh against a plain background, which also gives you an alpha map.
Next you mask the palm in the first frames and composite the rendered UFOs into the footage behind it. It is really hard to find evidence of CGI when the video is well made, compressed heavily enough to make it blurry, and there is no interaction with the scenery (for example light shining on the ground).
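To make the masking step concrete, here is a minimal sketch of the per-pixel blend in Python. It is only an illustration of standard alpha compositing, not what any particular painting or video software does internally. The function names and the frame layout (nested lists of RGB tuples, alpha and mask values between 0 and 1) are assumptions made for the example.

```python
def composite_pixel(background, ufo, alpha, palm_mask):
    """Blend one RGB pixel of the rendered UFO over the real footage.

    alpha     -- coverage of the UFO render at this pixel (0 = empty, 1 = solid)
    palm_mask -- hand-painted mask (1 where the palm is in front, 0 elsewhere)
    """
    # Where the palm is in front, the UFO must stay hidden.
    visible = alpha * (1.0 - palm_mask)
    return tuple(
        round(b * (1.0 - visible) + u * visible)
        for b, u in zip(background, ufo)
    )


def composite_frame(background, ufo, alpha, palm_mask):
    """Apply composite_pixel to whole frames given as rows of pixels."""
    return [
        [
            composite_pixel(bg_px, ufo_px, a, m)
            for bg_px, ufo_px, a, m in zip(bg_row, ufo_row, a_row, m_row)
        ]
        for bg_row, ufo_row, a_row, m_row in zip(background, ufo, alpha, palm_mask)
    ]
```

The blend itself is trivial. The slow part is painting an accurate palm mask by hand for every frame.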
The technology is the same as what was used in Jurassic Park, but now you can have the software at home. It still takes some time, though, to mask out objects from the real footage and process the video frame by frame.
That's most likely why the scene where the UFO is partially obscured by the palm is not even a second long: it's 15 images to process, which can take a few hours to get right.
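As a rough sanity check on those numbers, here is a small Python sketch. The frame rates (25 and 30 fps) and the figure of 3 hours for "a few hours" are assumptions for illustration, not something known about the actual video.

```python
def clip_seconds(frames, fps):
    """Length of a clip in seconds for a given frame count and frame rate."""
    return frames / fps


def minutes_per_frame(total_hours, frames):
    """Average hand-masking time per frame, in minutes."""
    return total_hours * 60 / frames


print(clip_seconds(15, 30))      # 0.5 seconds at an assumed 30 fps
print(clip_seconds(15, 25))      # 0.6 seconds at an assumed 25 fps
print(minutes_per_frame(3, 15))  # 12.0 minutes per frame if it takes 3 hours
```

Either way, 15 frames come out well under a second of footage.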
BTW, the software for making realistic-looking jet engine flames is also available for free now...