Sunday, March 13, 2016

Working with PhotoScan in the cloud

Back in 2012 when I first started flying drones to make high-resolution photomaps (e.g., strapping a first-generation GoPro to the bottom of a balsa-wood DIY drone and hoping for the best), there were few options for processing the photos.

Basically, if you didn't have access to $2,000 software, you only had Microsoft Image Composite Editor (ICE) to stitch together the photos into mosaics. Fortunately, much has changed since then.

In a window of just two years, a number of software solutions became available. VisualSFM brought free, open-source photogrammetry to tech-savvy hobbyists and researchers. There was Autodesk's 123D Catch, which could be used with drone imagery in a pinch. Pix4D came about in 2011 and later gained a huge market share in the professional UAS space. I won't get into all the options, but there's a fairly comprehensive table on Wikipedia that you might wish to look at.

The solution I use most often today is Agisoft PhotoScan. The feature set of the standard version is somewhat limited compared to solutions designed specifically for UAS use, but it's also easy to use, the software license is comparatively cheap, and it runs on ordinary desktop machines.

Many photogrammetry services are run in the cloud (123D Catch, Pix4D, DroneMapper), which has its benefits. You don't have to upgrade your machine to run complex models. You don't have to tie up a computer for hours while it's processing 500-1,000 photos. You can start a job in another country, send your images to the cloud instance, and by the time you arrive back home, your job can be done.

But processing in the cloud can mean paying fees by the month or by the job. If you like paying a one-time fee for a license, the cloud may not be the most attractive solution.

Thankfully, PhotoScan can be run in the cloud. While it does mean incurring hourly fees for computer time and cloud storage, it can also help in a pinch when you're working on an especially large project.

Lately, I've been making use of Amazon Web Services (AWS) Elastic Compute Cloud (EC2). It's a cloud hosting service for on-demand computing, and it's pretty spiffy. It's as simple as launching an instance, accessing the instance through a Windows Remote Desktop Connection, setting up the job, and waiting for it to complete.

AWS offers many types of instances, but the ones you want to use for PhotoScan are the G2 instances. The base G2 instance, g2.2xlarge, uses an NVIDIA GRID K520 GPU and gives you 15 GB of RAM.

That's just fine for some projects, but for larger projects (500-1,000 photos), you'll want to splash out on a g2.8xlarge instance. That gives you four GRID GPUs and 60 GB of RAM to work with. That's likely more than what you've got at home.

You'll have your choice of operating systems to run on these instances. PhotoScan does come in a Linux version, and if you know a thing or two about Linux you can probably tweak more performance out of your instance by changing a few settings.

Personally, I stick to G2 instances that are loaded with Windows Server 2012 R2. It's much easier to get up and running if you're already used to working with PhotoScan on a Windows machine. You can copy any file on your local machine and paste it into the Remote Desktop just like anything else.

Amazon provides some fairly good video tutorials on how to work inside the EC2 service, so I won't get into that. But I will provide some tips from what I've learned from using PhotoScan inside G2 instances.

Create an AMI of your own

After you set up your instance exactly the way you want it (installing the K520 driver and PhotoScan), exit your remote desktop connection and create an image of the instance (an AMI). This way, instead of having to copy over and install those items each time you want to run an instance, you already have a custom image ready to go. This will save you time and money.

In "Instances," right-click the instance that you want to image, and click "Image > Create Image" from the drop-down menu. Name it something memorable. If you create a g2.2 AMI, you can actually launch this on g2.8 hardware, so there's no reason to have two separate AMIs (I've actually had difficulty re-launching an AMI created in a g2.8 instance on g2.8 hardware for some reason -- haven't found out why just yet).
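If you'd rather script the imaging step than click through the console, here is a minimal sketch using boto3, the AWS SDK for Python. The instance ID, AMI name, description, and region are placeholders I made up for illustration, and it assumes your AWS credentials are already configured on the machine running it.

```python
def build_create_image_request(instance_id, name):
    """Build the keyword arguments for the EC2 CreateImage call."""
    return {
        "InstanceId": instance_id,
        "Name": name,
        # Free-text description; this wording is only an example.
        "Description": "PhotoScan + K520 driver, built on g2.2xlarge",
        # Let EC2 reboot the instance so the disk is imaged in a consistent state.
        "NoReboot": False,
    }


def create_photoscan_ami(instance_id, name, region="us-east-1"):
    """Create the AMI and return its ID (region is an assumed default)."""
    import boto3  # imported here so the helper above works without boto3

    ec2 = boto3.client("ec2", region_name=region)
    response = ec2.create_image(**build_create_image_request(instance_id, name))
    return response["ImageId"]
```

Calling create_photoscan_ami("i-0123456789abcdef0", "photoscan-g2") with your own instance ID should return the ID of the new AMI, which then shows up under "AMIs" in the EC2 dashboard.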

The EBS is your friend

Keep your work out of the root drive, and keep it in an Elastic Block Store (EBS) volume, so that you can work with the volume in both g2.2xlarge and g2.8xlarge instances. The first time you create a G2 instance, add one storage drive (an EBS volume) in addition to the root drive you're given. Store your PhotoScan images and files in this volume, not the root drive.

If you go on to launch a g2.8 instance, create the instance with the root drive only. Don't create any additional EBS volumes. Simply launch your instance, and after it is running, add the volume you created with the G2 instance. You can do this by going to "Volumes" under "Elastic Block Store" in the EC2 dashboard, selecting the volume you created with the G2 instance, and choosing "Attach Volume." Your volumes will have to be in the same "availability zone" as your instances.
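The availability-zone requirement is easy to trip over, so here is a rough boto3 sketch that checks it before attaching. The IDs, the region, and the "xvdf" device name are assumptions on my part (AWS suggests names in the xvdf to xvdp range for extra volumes on Windows instances), so adjust them to your setup.

```python
def same_availability_zone(volume, instance):
    """True if a DescribeVolumes entry and a DescribeInstances entry share a zone."""
    return volume["AvailabilityZone"] == instance["Placement"]["AvailabilityZone"]


def attach_work_volume(volume_id, instance_id, device="xvdf", region="us-east-1"):
    """Attach the PhotoScan work volume to a running instance."""
    import boto3  # imported here so the check above works without boto3

    ec2 = boto3.client("ec2", region_name=region)
    volume = ec2.describe_volumes(VolumeIds=[volume_id])["Volumes"][0]
    reservation = ec2.describe_instances(InstanceIds=[instance_id])["Reservations"][0]
    instance = reservation["Instances"][0]

    # A volume can only be attached to an instance in its own availability zone.
    if not same_availability_zone(volume, instance):
        raise ValueError("Volume and instance are in different availability zones")

    return ec2.attach_volume(VolumeId=volume_id, InstanceId=instance_id, Device=device)
```

The volume also has to be detached from the first instance (or that instance stopped and the volume detached) before it can be attached to the second one.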

After attaching, you'll have to go inside your instance's Disk Management (right-click the Windows start button, select "Disk Management") to mount the volume. Right-click the volume, and select "Online." You're now able to work with all the files you worked with in the G2 instance, but now on a much faster machine.

One last thing about storage. An instance will not only create EBS volumes to store root drives and other storage, it will also create snapshots of those volumes. A snapshot is a copy of a volume at a point in time. This is great if something happens to an image or a volume -- you can restore it right up to when you made the snapshot. But the snapshots also take up just as much space as the volumes they were taken from, and you are billed for this space just like the volumes.

If you're working on any small-scale, short-term projects, you might be willing to save a few bucks by deleting some of those snapshots. In your EC2 dashboard, go to "Snapshots" under "Elastic Block Store" to check your current snapshot situation and clean house if need be.
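For a quick look at how much snapshot storage you're carrying, something like the following boto3 sketch could help. The region is a placeholder, and for brevity it ignores pagination, so treat it as a starting point rather than a complete inventory tool.

```python
def total_snapshot_gib(snapshots):
    """Sum the source-volume sizes (in GiB) of a list of snapshot records."""
    return sum(s["VolumeSize"] for s in snapshots)


def list_my_snapshots(region="us-east-1"):
    """Return the snapshots owned by this account in one region."""
    import boto3  # imported here so the helper above works without boto3

    ec2 = boto3.client("ec2", region_name=region)
    # OwnerIds=["self"] filters out public snapshots owned by other accounts.
    return ec2.describe_snapshots(OwnerIds=["self"])["Snapshots"]


def delete_snapshot(snapshot_id, region="us-east-1"):
    """Delete one snapshot. A snapshot that backs a registered AMI cannot be
    deleted until that AMI is deregistered."""
    import boto3

    ec2 = boto3.client("ec2", region_name=region)
    ec2.delete_snapshot(SnapshotId=snapshot_id)
```

Printing each snapshot's "SnapshotId", "VolumeSize", and "StartTime" before deleting anything is a good habit, since the snapshot behind your custom AMI is one you'll want to keep.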

Set your instance to shut down automatically

If you're running a large project that could take many hours, you probably don't want to babysit the project for the whole time. I mean, that's a big reason why you wanted this computed in the cloud, right? So set up a CloudWatch alarm that will stop your instance after PhotoScan is done processing.

The simplest way of doing this is creating an alarm that stops your instance when the CPU utilization falls below 10 percent for a certain period of time. Set this by selecting an instance and clicking the "Monitoring" tab. Then click "Create Alarm."

Keep in mind that you can't stop an instance unless it is backed by an EBS volume, which usually means you'll have to make an image of your instance first. So resist the urge to dive into a PhotoScan project before first making an image of your instance.

Whenever your instance is stopped, your billing stops as well. But keep in mind that every time your instance changes from "stopped" to "running," you get billed one hour of compute time. Don't create scenarios where your instance is constantly stopping and restarting. If you're setting an alarm based on CPU utilization, make sure you're done fiddling around with settings and PhotoScan before you set the alarm, lest your alarm go off before you've started anything.

Setting an alarm can be tricky. When you set an alarm, AWS doesn't just start counting time and CPU utilization then. It considers the time leading up to when you created the alarm. So if your CPU has been idling for the past 10 minutes, and you set an alarm to stop the instance if it's been idling for 10 minutes, you can bet your instance probably will be stopped right after you set the alarm. Aim for a longer window, and make sure you're not going to trigger the alarm immediately after you create it.
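The same alarm can be created from a script. This is a sketch with boto3 under a few assumptions of mine: the alarm name, the region, and a 30-minute window (six 5-minute periods, which matches the default basic monitoring interval) are illustrative choices, not settings from my own projects.

```python
def build_idle_stop_alarm(instance_id, region, threshold_percent=10.0,
                          period_seconds=300, periods=6):
    """Build PutMetricAlarm arguments that stop an instance once it goes idle."""
    return {
        "AlarmName": "stop-when-idle-" + instance_id,  # hypothetical naming scheme
        "Namespace": "AWS/EC2",
        "MetricName": "CPUUtilization",
        "Dimensions": [{"Name": "InstanceId", "Value": instance_id}],
        "Statistic": "Average",
        # The alarm fires only after `periods` consecutive periods below the threshold.
        "Period": period_seconds,
        "EvaluationPeriods": periods,
        "Threshold": threshold_percent,
        "ComparisonOperator": "LessThanThreshold",
        # Built-in EC2 action that stops (not terminates) the instance.
        "AlarmActions": ["arn:aws:automate:" + region + ":ec2:stop"],
    }


def create_idle_stop_alarm(instance_id, region="us-east-1"):
    """Create the alarm in CloudWatch."""
    import boto3  # imported here so the builder above works without boto3

    cloudwatch = boto3.client("cloudwatch", region_name=region)
    cloudwatch.put_metric_alarm(**build_idle_stop_alarm(instance_id, region))
```

Because the alarm looks back over the time before it was created, it's safest to run this only once PhotoScan is already chewing on the job and the CPU is busy.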

If you are setting an alarm, you're probably also going to set PhotoScan to run a batch of processes all at once (aligning, dense cloud, mesh, texture, etc.). Make sure PhotoScan is automatically saving after each of these steps. Your progress will be saved in the EBS volume you're using in the instance. If your instance gets stopped and nothing is saved, your progress may be lost, and you'll have to start all over again.

If your instance is stopped, you can re-start it by right-clicking the instance and choosing "Instance State > Start." However, if you're about to re-start a stopped instance to take a look at your PhotoScan results, delete the previous alarm first so your instance won't stop while you're trying to touch up the product.
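If you script your restarts, the order matters: remove the alarm, then start the instance. Here's a small sketch that takes boto3 EC2 and CloudWatch clients as arguments. The function name and the alarm name you pass in are my own placeholders.

```python
def restart_without_alarm(ec2, cloudwatch, instance_id, alarm_name):
    """Delete the idle-stop alarm, then start the stopped instance."""
    # Remove the alarm first so it cannot stop the instance while you work.
    cloudwatch.delete_alarms(AlarmNames=[alarm_name])
    ec2.start_instances(InstanceIds=[instance_id])


def restart_my_instance(instance_id, alarm_name, region="us-east-1"):
    """Convenience wrapper that builds the clients (region is an assumption)."""
    import boto3  # imported here so the function above can be used with any clients

    ec2 = boto3.client("ec2", region_name=region)
    cloudwatch = boto3.client("cloudwatch", region_name=region)
    restart_without_alarm(ec2, cloudwatch, instance_id, alarm_name)
```

Remember that this start counts as a new billed hour, so it's worth batching up whatever touch-up work you have before restarting.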

What kind of performance can I expect?

I tested the performance of both G2 instances against my home computer, which is a machine with an AMD FX-8320 8-core, 3.5 GHz processor, with 16 GB RAM, and an ATI Radeon R7 260X GPU. The job was a 10.2 MP, 58-image scan of a dinosaur fossil. PhotoScan processing settings were the stock settings from a fresh install.

In terms of total processing time, from alignment to texture, the g2.8xlarge instance bested the desktop by 22%, and beat the g2.2xlarge by 33%. In sparse cloud construction, the g2.2xlarge instance processed 116.764 million samples/sec, whereas the desktop processed 292.776 million samples/sec, and the g2.8xlarge processed 436.901 million samples/sec.

There was a catch, however. The g2.2xlarge instance was $0.65 an hour to run, while the g2.8xlarge instance was $2.60 an hour. So while the latter completed the job sooner, the former cost less overall. The g2.2xlarge cost $0.46 to complete the model, while the beefier g2.8xlarge cost $1.23 (not including setup costs, time it takes to transfer files into and from Remote Desktop, etc.).
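To make the arithmetic explicit, here's a short Python sketch that works backward from the costs and hourly rates above to the implied run times. The function names are mine, and the whole-hour figure assumes billing rounds each run up to a full hour, as mentioned earlier for restarts.

```python
import math


def implied_minutes(cost, hourly_rate):
    """Run time in minutes implied by a pro-rated cost and an hourly rate."""
    return cost / hourly_rate * 60


def whole_hour_cost(minutes, hourly_rate):
    """Cost if every started hour is billed in full."""
    return math.ceil(minutes / 60) * hourly_rate


def percent_faster(fast_minutes, slow_minutes):
    """How much less time the faster run took, as a percentage of the slower run."""
    return (1 - fast_minutes / slow_minutes) * 100


small = implied_minutes(0.46, 0.65)  # g2.2xlarge, roughly 42 minutes
large = implied_minutes(1.23, 2.60)  # g2.8xlarge, roughly 28 minutes

print("g2.2xlarge: %.1f min, whole-hour cost $%.2f" % (small, whole_hour_cost(small, 0.65)))
print("g2.8xlarge: %.1f min, whole-hour cost $%.2f" % (large, whole_hour_cost(large, 2.60)))
print("g2.8xlarge finished %.0f%% sooner" % percent_faster(large, small))
```

The implied times line up with the 33% difference reported above. Since both runs fit inside a single hour, whole-hour billing would charge the full hourly rate for each, which widens the cost gap for a job this small.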

I haven't yet collected enough data to explain how this scales up with larger image sets. It could very well be that g2.8xlarge is more cost-effective if you are running very large sets of images. After all, it has 60 GB of RAM to play with. In any event, I'll let you know what I find out.

Here's the dinosaur fossil, by the way: