I found the documentation for TeamCity’s new vSphere support a bit lacking, both on configuration and on why certain things are set up a specific way, so here we dive into how everything works.
I’m going to assume you have a fairly good grasp of what TeamCity is and how to manage it. If you feel I’ve skipped anything or should go into more detail, drop me a message through the site’s contact page or in a comment here.
TeamCity Cloud Integration
TeamCity’s cloud integration allows you to move your build agents from machines you may have online all day to a base image you clone out and spin up as required (and as many as required in whatever combination).
Pros:
- You’re not limited to 3 build agent configurations with the base TeamCity install; you’re limited to 3 active at any one time. Good for multi-platform environments.
- Resources are only used when needed.
- Every build can be a clean build (if you trash your build agents after a build).
Cons:
- Running a single build agent per virtual machine, instead of multiple build agents on one multi-core machine, may waste resources.
- Build times will increase by the time it takes to clone and boot the virtual machine.
Correction: I originally also listed “You need to move to your build agents being stateless” here. That isn’t true (I’m dumb, Jody Shumaker corrects me in the comments).
JetBrains has published the plugin required to integrate with vSphere on GitHub here: https://github.com/JetBrains/teamcity-vmware-plugin/. I’m very thankful they’ve open-sourced it because of the hang-up mentioned later with the required resource pools.
Preparing Your Base VM
- Install the OS of your choice (so far Windows and *nix environments are supported out of the box; the plugin needs updating if you want to support more).
- Install VMware tools.
  - This is used by TeamCity cloud to properly configure your build agent, and is required.
- Install all of your build tools.
- Install Java for the TeamCity build agent (if it isn’t part of your build tools).
- Install the TeamCity build agent.
- Verify it shows up in TeamCity’s unauthorized agents list, and check your agent parameters and compatible configurations.
- Shut down the TeamCity build agent service on the build agent virtual machine.
  - You may want to remove the build agent from the unauthorized list at this point just to clean things up, but this is up to you.
- Remove name, serverUrl and authorizationToken from conf/buildAgent.properties on the build agent.
  - This makes your image generic: the cloud plugin and VMware tools will auto-populate these values for you, and if you later boot the base virtual machine to tweak it, you don’t have to worry about it coming up as a valid build agent.
- Shut down the VM and snapshot it (without a snapshot your virtual infrastructure will try to clone the entire VM, which will make spin-up times for build agents extremely high).
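If you’d rather script the buildAgent.properties cleanup than edit the file by hand, here is a minimal sketch for a *nix base image. The helper name strip_agent_identity is made up for illustration. I’m assuming plain key=value lines in the file, and it accepts both the serverUrl and serverURL spellings. It keeps a .bak copy next to the original.

```shell
#!/bin/sh
# Sketch only: strip_agent_identity is a hypothetical helper, not part of TeamCity.
strip_agent_identity() {
    props="$1"                 # path to conf/buildAgent.properties
    tmp="$props.tmp.$$"
    cp -p "$props" "$props.bak" || return 1     # keep a backup of the original
    # Delete the three identity keys (accepts the serverUrl and serverURL spellings).
    # Comment lines and every other property are copied through unchanged.
    sed -E '/^[[:space:]]*(name|[Ss]erver[Uu][Rr][Ll]|authorizationToken)[[:space:]]*=/d' \
        "$props.bak" > "$tmp" || return 1
    # Overwrite in place so the file keeps its owner and permissions.
    cat "$tmp" > "$props" && rm -f "$tmp"
}
```

Run it with the agent service stopped, for example `strip_agent_identity /opt/buildAgent/conf/buildAgent.properties` (that path is just an example; use wherever your agent is installed).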
In the most recent version of VMware, on some Linux platforms, you’ll be told to use open-vm-tools instead when you go to install VMware tools. At least on Debian 7 (I haven’t tested other platforms), the vmware-rpctool binary ends up in /usr/bin instead of /usr/sbin, where TeamCity expects it. So we’ll just make a link for it:
$> ln -s /usr/bin/vmware-rpctool /usr/sbin/vmware-rpctool
I have an open issue with JetBrains on it to update their documentation.
Update: No longer an issue on the latest patch, quick turn-around from JetBrains!
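If you’re still on a plugin build without that fix and want to bake the link into your base image setup, a guarded version might look like the sketch below. The function name ensure_rpctool_link is made up for illustration, and the default paths are the Debian 7 ones mentioned above. It creates a symbolic link (my choice, so the link keeps following the binary across package upgrades) and does nothing if the target already exists.

```shell
#!/bin/sh
# Sketch only: ensure_rpctool_link is a hypothetical helper.
# Defaults are the Debian 7 locations from the post; pass other paths to override.
ensure_rpctool_link() {
    src="${1:-/usr/bin/vmware-rpctool}"
    dst="${2:-/usr/sbin/vmware-rpctool}"
    # Leave any existing file or link (even a dangling one) alone.
    if [ -e "$dst" ] || [ -L "$dst" ]; then
        echo "already present: $dst"
        return 0
    fi
    # Nothing to link to: open-vm-tools is probably not installed.
    if [ ! -x "$src" ]; then
        echo "not found: $src" >&2
        return 1
    fi
    ln -s "$src" "$dst" && echo "linked $dst -> $src"
}
```

Calling `ensure_rpctool_link` with no arguments (as root) uses the default paths.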
Configuring Your Cloud
Go to Administration > Agent Cloud (under Server Administration), add a new profile, and give it a name.
I like to use a dedicated vSphere account that has access only to a specific folder in our environment, including its network and datastores, so it can’t do anything to the rest of the infrastructure.
- Terminate instance idle time – This is how long a spun-up virtual machine will wait without a task before being shut down. I set this fairly low (10 minutes).
- Terminate instance (after first build completed) – This trashes a virtual machine after the build process is complete. I keep this on because it’s one of the major reasons I’m doing this.
- Cloud type – Set this to VMware vSphere.
- vCenter SDK URL – Set this to https://[vCenter FQDN]/sdk. By default it accepts the self-signed certs provided by your vCenter box just fine.
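Before saving the profile, it can be worth checking from the TeamCity server that the SDK URL is reachable at all. This is a rough sketch, assuming curl is installed. Both function names are made up for illustration, and vcenter.example.com stands in for your vCenter FQDN. It only confirms that something answers over HTTPS at that address. It does not log in or validate the vSphere API.

```shell
#!/bin/sh
# Build the SDK URL from a vCenter FQDN (the /sdk suffix is the one used in the profile).
vcenter_sdk_url() {
    printf 'https://%s/sdk\n' "$1"
}

# Hypothetical reachability probe: any HTTP status means the host answered over TLS.
check_vcenter_sdk() {
    url=$(vcenter_sdk_url "$1")
    # -k accepts the self-signed cert, -s silences progress output,
    # -o /dev/null discards the body, -w prints only the HTTP status code.
    code=$(curl -k -s -o /dev/null --max-time 10 -w '%{http_code}' "$url") || {
        echo "no HTTPS response from $url" >&2
        return 1
    }
    echo "$url answered with HTTP $code"
}
```

For example: `check_vcenter_sdk vcenter.example.com`.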
Next we’ll need to configure some images, click “Add Image”, select a virtual machine from the pull-down list, pick a snapshot (you’re using snapshots, right?), select a folder that the clones will go into, select a resource pool (more on that later) and set the maximum number of virtual machines you want to be able to run at once.
If you don’t have resource pools because you’re on a version of vSphere that doesn’t support them, you’ll either have to wait until I get my fork done that removes this requirement and have JetBrains pull it, or download the plugin and remove the requirement yourself (I hear an older version of the plugin doesn’t require it, but I haven’t verified that).
Update: JetBrains beat me to it, issued a patch, works great now.
Verification And Validation
When you build a project with zero build agents installed, it’ll go into the queue. The TeamCity cloud plugin will spin up a virtual machine for you, and when that machine is registered it should show up with the name of the virtual machine. If this is what you got, success!
If not, feel free to post comments and I’ll see if I can help out.
“You need to move to your build agents being stateless.”
This is not true. You can configure an instance to use the instance as-is. It’ll just start/stop that instance as needed, keeping all state between each start/stop.
Thanks for helping out with removing the resource pool requirement. I see they did pull that in along with the open-vm-tools fix, but haven’t marked those builds as deploy yet. Also, something I’ve noticed and reported: using a low idle time can run into an issue where the agent hits a bug and sits there doing nothing for 20 minutes before finishing its startup. Right now the idle time starts from the instant the instance starts the agent, not from when the agent connects to the server, so with a 10 minute idle time you can run into a scenario where the instance starts, never connects, hits the 10 minute idle limit, and gets shut down. This then repeats until eventually the agent starts up faster than the idle time. It doesn’t happen all the time, but if you see your agents cycling and never running anything, this might be the cause.
I’ve updated the post, completely forgot that you can just have it power on/off virtual machines too so you can have stateful virtual machines.
They actually wrote their own resource pool fix before I did, JetBrains was super fast to react to my issues and beat me to the punch!
Ouch, I think if I had a 10 minute start time our developers may come after me with pitchforks! The additional minute or two it can take has raised a lot of upset mumbling (even if we aren’t chasing any more issues caused by virtual machine state). Though I know connect times can increase drastically if you’re using stateless virtual machines and you’re not upgrading the agents as you patch TeamCity and add plugins.
This is unfortunately a bug in the vSphere agent plugin, not sure on specifics as I haven’t had time to debug, but during agent startup it sits and does nothing for 20 minutes, then continues start up as normal. Thankfully doesn’t happen often, but very frustrating when it does. Hopefully they can figure out a fix soon.
I am trying to set up TeamCity to start stateful agents. I can start the agents manually from the Administration/Agent Cloud UI in TeamCity, and the vSphere plugin is able to shut them down after a build timeout. I am having a problem getting the vSphere plugin to start the agents. Has anyone run into this?
I’ve run into a few different things
Still can’t get the builds to start automatically. I am using the free version of TeamCity to mock up a continuous build server in the cloud. For the bullets:
1. I have manually started a VM after setting the serviceURL. I also added it to the cloud profile to start and stop the image. In the default pool it appears twice: once as “agent2” and again as “vmWare cloud profile, agent2”. The “agent2” entry initially says “Not authorized”. Going to the Cloud tab, agent2 has a red “(!)” next to it. Hovering over it says:
“No agents connected after instance start. Please check the image has TeamCity agent configured and it can connect to the server using http://loadtestteamcity:8111 address. Start the instance manually to check for agent again.”
I have tried to authorize agent2 and run a build on it manually, and it works. When it is shut down, the plugin will not start it. The red “(!)” still shows up on the cloud tab.
2. I only have three agents: the default agent (I have tried disabling and unauthorizing it), agent2 and agent3 (agent3 is another VM that has the same problem as agent2).
3. Just using the default pool.
I could use a mental check. Any way you could join a WebEx?
YES! I got it. I was missing the link you mentioned. I assumed that since I had just installed, I didn’t need it because of the comment above: “Update: No longer an issue on the latest patch, quick turn-around from JetBrains!”. I guess the fix is still only in a patch rather than in the production release. But that did the trick. Thanks for doing this post.
Awesome, glad you figured it out.
Love this how-to… precise but short. One update: your issue at JetBrains has been fixed, so the linking is no longer necessary.
Additionally, it may be worth pointing out the access rights needed in vCenter, since simply assigning administrator rights is not recommended.
P.S.: Do you have any experience with the latest Agent Push / Installation feature on Cloud profiles? Somehow I haven’t managed to get this running with vCenter VMs. They start, but Agent Push seems not to be initialized, or at least not to succeed.
I messed around with it briefly but mostly found it not worth it, since keeping an agent ready-to-go reduced time-to-ready on build agents. If I had a very specific requirement for it I would mess around with it more, but pre-warmed agents bring build times down *a lot* (we typically will do stuff like pull docker image chains and whatnot on the base image).
I may have the rights sitting around somewhere, I had constrained my system to only be able to deploy into a specific folder and whatnot (since we have two TeamCity installs playing with the same vCenter server).