Sharing hints, tips, experience, ideas and other cool stuff about Amazon Web Services

AWS BLOG


Patching Windows servers with AWS EC2 Systems Manager

 

Amazon EC2 Systems Manager is a collection of capabilities that helps you automate management tasks such as collecting system inventory, applying operating system patches, automating the creation of Amazon Machine Images (AMIs), and configuring operating systems and applications at scale. It is available at no cost to manage both your EC2 and on-premises resources!

Amazon EC2 Systems Manager relies on the Amazon Simple Systems Management Service (SSM) agent being installed on the guests. The SSM agent is pre-installed on Windows Server 2016 instances and on Windows Server 2003-2012 R2 instances created from AMIs published after November 2016. You need at least SSM agent version 2.0.599.0 installed on the target EC2 instance.

In this article we will focus on using Systems Manager to apply Windows updates to EC2 instances. Patch management is always an operational pain point, so it’s welcome that AWS offers a solution.

You start by creating groups of instances by applying a tag called ‘Patch Group’. Then you create a group of patches by defining a patch baseline that includes and excludes the patches you require (or use the AWS default patch baseline). Finally, you create a maintenance window that applies your patch baseline to a patch group. The actual ‘patch now’ run command is nothing more than an API call, so there’s no obligation to use Maintenance Windows. Personally I’m a fan of Rundeck, so I’ll show you how to apply the patches to the instances using both methods.

Configure your instances

The SSM agent running inside the Windows guest OS requires permissions to connect to AWS EC2 Systems Manager. We grant these rights by creating an EC2 service role with the ‘AmazonEC2RoleforSSM’ policy attached. Then you can attach this role to your instances. The instance also needs an outbound internet connection to be able to reach SSM, either through an Internet Gateway or a NAT Gateway (or NAT instance).
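
If you prefer scripting this over clicking through the console, a minimal sketch with the AWS CLI could look like the following (role, profile and instance names are placeholders; attaching a profile to a running instance needs a recent awscli):

    aws iam create-role --role-name ec2-ssm-role \
      --assume-role-policy-document '{"Version":"2012-10-17","Statement":[{"Effect":"Allow","Principal":{"Service":"ec2.amazonaws.com"},"Action":"sts:AssumeRole"}]}'
    aws iam attach-role-policy --role-name ec2-ssm-role \
      --policy-arn arn:aws:iam::aws:policy/service-role/AmazonEC2RoleforSSM
    aws iam create-instance-profile --instance-profile-name ec2-ssm-role
    aws iam add-role-to-instance-profile --instance-profile-name ec2-ssm-role --role-name ec2-ssm-role
    # attach the profile to an existing instance (placeholder instance ID)
    aws ec2 associate-iam-instance-profile --instance-id i-0123456789abcdef0 \
      --iam-instance-profile Name=ec2-ssm-role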

If you have done this right, your instance(s) should pop up under ‘Managed Instances’ in the EC2 console:

Take note of the SSM agent version. As mentioned earlier, it must be at least 2.0.599.0. The Systems Manager service also requires a ‘Patch Group’ tag on the EC2 instance. The tag key must be exactly Patch Group (it is case sensitive); the value can be anything you want to specify.
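
Tagging can of course also be scripted (the instance ID and group name are placeholders):

    aws ec2 create-tags --resources i-0123456789abcdef0 --tags 'Key=Patch Group,Value=dev'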

If done correctly, your tag will be picked up by SSM. You can confirm this on the ‘Managed Instances’ page:

 

Patch Baselines

AWS provides a default patch baseline called ‘AWS-DefaultPatchBaseline’. It auto-approves all critical and security updates with a ‘critical’ or ‘important’ classification seven days after they have been released by Microsoft. If you’re happy with that, you can use this baseline. If you’re not, you can simply create your own according to your requirements: set approval for specific products and patch classifications, exclude specific KBs, and so on.
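
If you want to script this instead of clicking through the console, a sketch of such a baseline with the AWS CLI could look like this (the name, KB number and exact filter values are placeholders and assumptions; check the API reference for the full set of allowed filters):

    aws ssm create-patch-baseline \
      --name "windows-prod-baseline" \
      --description "Critical and Important security updates, approved after 7 days" \
      --approval-rules '{"PatchRules":[{"PatchFilterGroup":{"PatchFilters":[{"Key":"CLASSIFICATION","Values":["CriticalUpdates","SecurityUpdates"]},{"Key":"MSRC_SEVERITY","Values":["Critical","Important"]}]},"ApproveAfterDays":7}]}' \
      --rejected-patches "KB1234567"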

Once you’re happy with your baseline, you can hit ‘Create’. Now assign it to one or more Patch Groups (or make it the default baseline and discard the AWS one). Open the ‘Actions’ menu and choose ‘Modify Patch Groups’.

Type the names of the Patch Groups you defined when tagging your instances
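
The CLI equivalent, for reference (the baseline ID is a placeholder):

    aws ssm register-patch-baseline-for-patch-group --baseline-id pb-0123456789abcdef0 --patch-group "dev"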

Your baseline is now attached to the specified patch groups. You can now start evaluating your instances against the baseline, and update them accordingly.

Patching

Applying the patch baseline to a specific instance or to a patch group is nothing more than executing an AWS SSM run command. You can schedule this run command through AWS SSM ‘Maintenance Windows’, a cron job on a server (like Rundeck) or manually through the AWS Console.

Let’s first check everything manually. In the AWS EC2 console, go to ‘Run Commands’ and create a new Run Command. Select the ‘AWS-ApplyPatchBaseline’ command document and pick an instance to run it on. For the ‘Operation’, choose ‘Scan’. This will evaluate the instance against the baseline without installing anything yet.

Once the run command finishes, you can go back to the ‘Managed Instances’ page. Highlight the instance(s) on which the run command was executed and click on the ‘Patch’ tab. Here you can see the result of the scan:

To actually install the missing updates, execute the same run command document, but now with the ‘Install’ operation. This will install the missing KBs to the instances and reboot them if needed.

Or execute the following aws cli command to accomplish the same:
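
A minimal sketch of such a command, with a placeholder instance ID:

    aws ssm send-command \
      --document-name "AWS-ApplyPatchBaseline" \
      --instance-ids "i-0123456789abcdef0" \
      --parameters "Operation=Install"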

Maintenance Windows

Instead of manually starting a run command or cron job, we can also use the AWS-provided Maintenance Windows feature. Systems Manager Maintenance Windows let you define a schedule for when to perform actions on your instances, such as patching the operating system. Each Maintenance Window has a schedule, a duration, a set of registered targets, and a set of registered tasks.

Before actually creating a Maintenance Window, we must configure a Maintenance Window role. We need this so Systems Manager can execute tasks in Maintenance Windows on our behalf. So we go to the IAM page and create a new role. We pick an “EC2 service role” type and make sure to attach the “AmazonSSMMaintenanceWindowRole” policy to it. Once the role is created, we must modify it: click “Edit Trust Relationships”, add a comma after “ec2.amazonaws.com”, and then add “Service”: “ssm.amazonaws.com” to the existing policy:
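
The result is a trust policy that allows both ec2.amazonaws.com and ssm.amazonaws.com to assume the role. If you prefer the CLI over the console, a sketch (the role name is a placeholder):

    aws iam update-assume-role-policy --role-name MaintenanceWindowRole \
      --policy-document '{"Version":"2012-10-17","Statement":[{"Effect":"Allow","Principal":{"Service":["ec2.amazonaws.com","ssm.amazonaws.com"]},"Action":"sts:AssumeRole"}]}'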

Back to SSM now to actually create the Maintenance Window. Give it a useful name and specify your preferred schedule. I’m setting ‘every 30 minutes’ just for demonstration purposes, but in a real setup you would most probably choose something like ‘every Sunday’. You can also configure your own cron expression.
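
If you prefer to script it, the window can also be created from the CLI; a sketch (the duration and cutoff are just example values):

    aws ssm create-maintenance-window \
      --name "patch-dev" \
      --schedule "rate(30 minutes)" \
      --duration 2 \
      --cutoff 1 \
      --no-allow-unassociated-targets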

This leaves us now with an empty Maintenance Window: there are no tasks nor targets associated yet.

To assign targets to the Maintenance Window, click on the “Register new targets” button on the “Targets” tab. We dynamically select the targets by using the “Patch Group” tag.

We will now have an ID linked to our “dev” Patch Group. This “Window Target ID” is used in the next step.

From the “tasks” tab of the Maintenance Window, click on “Schedule new task”. Pick the “AWS-ApplyPatchBaseline” document. Under “Registered Targets”, select the correct Window Target ID. For the operation, select “Install”. For the “Role”, select the IAM role with the AmazonSSMMaintenanceWindowRole attached to it (the one we created earlier). Set your preferred concurrency level and register the task by clicking on the blue button. The end result should look like this:
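
For completeness, the target and task registration can also be scripted; a sketch (window ID, window target ID and role ARN are placeholders):

    aws ssm register-target-with-maintenance-window \
      --window-id mw-0123456789abcdef0 \
      --resource-type INSTANCE \
      --targets "Key=tag:Patch Group,Values=dev"

    aws ssm register-task-with-maintenance-window \
      --window-id mw-0123456789abcdef0 \
      --targets "Key=WindowTargetIds,Values=<window-target-id>" \
      --task-arn "AWS-ApplyPatchBaseline" \
      --task-type RUN_COMMAND \
      --service-role-arn arn:aws:iam::123456789012:role/MaintenanceWindowRole \
      --task-parameters '{"Operation":{"Values":["Install"]}}' \
      --max-concurrency 2 --max-errors 1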

Now we have to wait for the schedule of the Maintenance Window. In this example we specified ‘every 30 minutes’ as a schedule, so the waiting shouldn’t take too long. Under the ‘History’ tab of the Maintenance Window you can follow all actions. The Maintenance Window will simply launch a Run Command, so you could go to that console screen too. If you enabled logging to S3, you could find the output of the Run Command over there. If not, you can view a (truncated) output via the Run Command itself:

If we now go back to the “Managed Instances” page and look at the “Patch” tab of our test instance, we will see it is not missing any updates anymore!

 

Success! Another check on the automation checklist!

 

Rutger

Cloudar as a Next Generation MSP

Historically, Managed Service Providers (MSPs) had the broad responsibility of taking care of customer environments from the bare metal up to the operating system and sometimes the application level. In the new world of cloud computing, things are changing at a rapid pace and responsibilities are shifting. Where the datacenter and physical layer of cloud computing are the sole responsibility of AWS, MSPs are moving up the stack and will focus on cloud security, consultancy, application level monitoring, cost control and high availability.

What can one expect from a (as AWS tends to call it) Next Generation MSP? It all starts with knowledge. An MSP has engineers that are both thoroughly trained and certified on AWS. At Cloudar we have a development track that makes sure all our engineers are AWS Certified. We not only have a high percentage of Associates, but also a good number of Professionals, and even one engineer holding all five certificates. This creates an internal ecosystem, facilitated by Slack, where knowledge is shared and customer issues are quickly discussed and solved. Cloudar is an Advanced Consulting Partner, with a broad network within AWS.

Where automation was important in traditional managed hosting, it is vital in an AWS environment. And thanks to the broad range of APIs available on AWS, the sky is the limit. Now why is automation that important nowadays? I see three reasons. First, in terms of cost. Build once, deploy many. If you need to repeat a task, script it. It will be cheaper in the long run. Second, and this is often overlooked, in terms of security. A lot of security issues stem from human error. By scripting repeatable processes, and peer reviewing these scripts, the chances of creating unintentional security holes are greatly diminished. Third, in terms of usability. When using proper source control and deployment tools, everyone can deploy new environments or applications with the click of a button. A next-gen MSP will apply all these skills to set up and manage your environment.

A traditional MSP was mostly concerned with threshold-based monitoring. Are my servers still online, is my hardware healthy and are my disks not full? While some of these are still very relevant, monitoring will shift more towards application level monitoring. From uptime of servers to uptime of applications. At Cloudar, on top of traditional threshold-based monitoring, we offer Application Performance Monitoring and even Real User Monitoring. This way you not only know whether all components of your application are healthy, but also that the application itself works within expected boundaries. We have our standard set of monitoring tools to deal with this, but are also happy to assist you in using third party tools like Datadog or New Relic. You will get your own dashboard to check your environment health at any time.

It has never been easier to build highly available environments. In the old days, it took weeks if not months to set up multi-datacenter solutions: ordering hardware, configuring global load balancing, storage replication, VMware SRM… AWS has all infrastructural requirements built in. This means it has become second nature to always start from a high availability scenario with at least two Availability Zones in mind. From there on, a next-gen MSP will look at your workloads and determine what the best way is to run them in the cloud. In all this, Cloudar acts as your trusted partner, and determines what the best course of action is. This can range from a traditional lift and shift, over cloud-optimized, to a new cloud-native deployment together with one of our application development partners.

Controlling costs is in the DNA of AWS, and we have made it our own to do the same. This means we will not only design the most cost effective environments for our customers, but also continuously assess whether this is still the case during the lifetime of your setup. We do this in two ways. Primarily, we follow up on all new AWS announcements and check what the impact can be for our customer base. A small example can be seen in a previous blog post: About AWS and Saving Money. If we see ways you can save money by using new services, we will let you know. Second, through the use of CloudCheckr. This tool will scan your environment and make recommendations on downsizing instances, unused resources, buying RIs and all other cost saving options.

As you can see, things are changing in MSP land. An MSP does not solely host your servers anymore. It is your partner in a cloud world that lives by the principles and processes of DevOps. Cloudar is born in the cloud. Many of the changes traditional MSPs need to make to stay on board are in our DNA.

Cloudar, a short recap…

Today, exactly 2 years ago, Senne Vaeyens and I hired Ben Bridts as our first employee at Cloudar.
What started out as a great idea transformed into a solid business model and shaped Cloudar into the company it is today. With 15 dedicated full-time engineers on the payroll, we are now able to provide first class support and advice on AWS.
Trusted by many customers (from startups to large enterprises), we have proven our expertise in Amazon Web Services, DevOps and Managed Services along the way, and we’re planning to take this to an even higher level in 2017.
In 2017 we will:
– Grow our business and keep on extending the team (feel free to contact me if you would like to join)
– Take our solid partnership with AWS to the next level
– Establish more vendor partnerships with AWS Technology Partners
– Extend our customer base
– Obtain several AWS Competencies, including the AWS Managed Services, DevOps and Big Data Competencies.
– Improve customer service and provide top-notch AWS support and expertise to our customers
– Explore new markets and technologies

All of this wouldn’t be possible without the help of our great team, so please join me in giving a big thumbs up for the entire Cloudar Team.
THANKS GUYS!!


If anyone would like to know more about the services Cloudar can provide to you as a customer, feel free to drop me an email at bvh@cloudar.be or send me a PM.

Cheers,

Bart

Using the Application Load Balancer and WAF to replace CloudFront Security Groups

If you’ve been using a Lambda function to update security groups that grant CloudFront access to your resources, you may have seen problems starting to appear over the last few days. There are now 32 IP ranges used by CloudFront, and you can add only 50 rules to a security group. This seems fine, but if you want to allow both HTTP and HTTPS, you’ll have to split the 64 rules over two groups. This may limit you in other ways, as you can add only 5 security groups to a resource.
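
You can check the current number of CloudFront ranges yourself against the published ip-ranges.json (assuming curl and jq are available):

    curl -s https://ip-ranges.amazonaws.com/ip-ranges.json \
      | jq '[.prefixes[] | select(.service=="CLOUDFRONT")] | length'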

You can replace this Lambda with the recently launched WAF (web application firewall) for ALB (application load balancers).

Here is how to do that (assuming you already have a CloudFront distribution and Application Load Balancer set up).

CloudFront configuration

  1. Go to the “Origins” tab of the Distribution you want to use and edit the origin that’s pointing to your ALB.
  2. Add a new Origin Custom Header. You can use any header name and value you like; I opted for “X-Origin-Verify” with a random value.

WAF/ALB Configuration

  1. Go to the WAF service page and create a new Web ACL
  2. Give the ACL a name and select the region and name of your ALB
  3. Create a new “String matching condition”. We’ll create one called “cloudfront-origin-header” that will match when our custom header has the same random value.
  4. (Optional) If you want to allow your own IP without the secret header, for testing purposes, add an “IP match condition” that will match the IPs you trust. We have named that condition “trusted-ips”.
  5. Now we can create a rule to allow requests that match the conditions we created. Click on “Create rule” to create a rule for all requests with our custom header.
  6. (Optional) Do the same for a rule with the IP condition
  7. Configure the ACL to allow the rules we just created and block all requests that don’t match any rules

Result

If you surf directly to the ALB with an untrusted IP address, you should now see a 403 page:


However, when you add the Custom header, or go through CloudFront, you are allowed to visit the website:


Caveats

This service is very new, so while setting this up, we ran into some rough edges. We’ve opened a support request so that AWS can look into fixing those.

  • You can’t see the ACLs you created inside a region (WAF for CloudFront is a global service) if you use the CLI. According to the documentation, you should be able to do this if you override the endpoint URL. At the time of writing this gives errors. If you want to check whether this has been fixed, you can use this command: aws waf list-web-acls --endpoint-url https://waf-regional.us-east-1.amazonaws.com
  • Currently there are no metrics available for the WAF inside a region (even though you have to specify a metric name for the rules and conditions you create).
  • If there are no healthy hosts in the target group of your ALB, you will always get a 503 error response, even if the request gets blocked by the WAF.

Troposphere helper functions

Here at Cloudar we write a lot of CloudFormation to provision AWS resources. We really like the way CloudFormation creates resources in parallel and how it provides an easy way to clean up all created resources.

However, writing CloudFormation can be a bit of a pain. Even though AWS made this a lot easier with YAML support, for big templates we still use Troposphere. Troposphere is a Python package that provides a simple (one-to-one) mapping to CloudFormation. It has some advantages like offline error checking, but its greatest asset is that it can be used in combination with a real programming language.

Having Python available to write templates leads to writing helper functions to simplify some verbose constructs and bundle commonly used resources. Today we’re happy to publish this as an open source package.

You can find our troposphere helpers on pypi or on github.

Here is an example of how it can simplify CloudFormation code:

This is the normal way to add Option Settings to an ElasticBeanstalk configuration in Troposphere:

Using our helper function this can be reduced to:

And this is only one of the functions in there (and we expect to keep adding to this over time).

Do you have any helper functions you’ve written? Let us know!

Using Route53 to support on-demand applications

One of the best ways to save money on AWS is turning resources off when you don’t use them. This is pretty easy to automate if you have consistent usage patterns (like an application that’s only used during business hours), but can be harder if the usage is very irregular (for example an application that’s only used a few times per quarter).

We recently worked with a customer that had some applications that could be without usage for months. To be more cost efficient, they were looking for a solution where:

  • They could turn off as many instances and services as possible
  • The users could start the application with one button click if they needed to use it
  • The users didn’t have AWS credentials

We came up with the following solution to satisfy these requirements, and if you’re running the same kind of applications, maybe you can also reduce costs by implementing this.

Failover Diagram

This solution works by taking advantage of Route53 health checks. We’ve split up our infrastructure in two parts: an always-on part that uses low-cost or usage-based services to provide the user with a way to start the real application, and a part that can be started and stopped on demand.

We configure the on-demand part to be the primary resource in Route53 and the always-on part as a failover. This way the traffic will be routed to the real application if it’s online, and the user will get a static webpage that gives him the option to start the application if it’s not.

If we look at how this would go if the application is offline, these are the steps that would happen:

  1. The user requests the DNS record for application.example.com from Route53. Because the real application is offline, Route53 will respond with the recordset of the fallback CloudFront distribution.
  2. The user requests a page from CloudFront. CloudFront will get this from S3 and serve it to the user. This page contains an explanation of why the application is not available and a button to start it.
  3. When the user clicks the button, it uses javascript to call the API Gateway and invoke a lambda function.
  4. The lambda function calls Service Catalog or CloudFormation (depending on your environment) to start the real application (see the sketch after this list)
  5. When the application has started, the health check will pass, and Route53 will start returning the recordset for the CloudFront distribution that is linked to the application
  6. When the user uses the new DNS records, the traffic will go through the second CloudFront distribution and on to the real application
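
As a sketch of step 4: the Lambda function essentially issues the same API call you could make from the CLI. The stack name and template URL below are placeholders, and your function might call Service Catalog instead of CloudFormation:

    aws cloudformation create-stack \
      --stack-name on-demand-app \
      --template-url https://s3-eu-west-1.amazonaws.com/my-bucket/on-demand-app.template \
      --capabilities CAPABILITY_IAM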

Some things to keep in mind.

This is only a high-level overview of a possible solution. To implement this, you would also have to consider the following:

  • After starting the application, the static webpage should refresh the page to force the browser to do a new DNS lookup.
  • CloudFront will cache errors for 5 minutes by default. Decreasing this will make the failover go faster.
  • The TTL of a CloudFront DNS record is 60 seconds

No Limit?

[A:] No no limits, we’ll reach for the sky!
No valley too deep, no mountain too high
No no limits, won’t give up the fight
We do what we want and we do it with pride
No no, no no no no, no no no no, no no there’s no limit!
No no, no no no no, no no no no, no no there’s no limit!

No limit? Well, actually there is. Several actually. And that became painfully clear yesterday, when I was scripting the new environment for one of our customers. Not using Troposphere, so it can more easily be managed by non-Python savvy people.

What they need is not that special. They want to be able to deploy identical environments fast and easy. Not very complex environments either. Mainly EC2 and RDS. Say 10 servers and 5 DB instances.

But you know how it goes. All servers in an environment have different disk layouts. Different instance types. Different availability zones. And while the requirement now is to deploy completely identical environments, you know the day will come when someone will come up to you and ask: why are we using SSD disks in our Dev environment? Why are those partitions so large in Test? So it’s best to be prepared and allow for some flexibility. The plan was to create a CloudFormation script and deploy it using Ansible. All configurable parameters can then be put in Ansible in an easy YAML structure instead of, for example, a JSON parameter file.

So I started writing the code to create one server and its backend RDS instance, thinking: if I get this straightened out, it’s just a matter of copy pasting it for most other servers and instances, and setting server specific parameter values in Ansible. Well, pretty soon I hit the first AWS limit: one can only have 60 parameters for a CloudFormation template. I had many more. Bummer. I first looked into nested Stacks to overcome this limit, but as you can’t pass parameters straight to a child stack, they were not the answer here. They are an answer to a different problem though, but more on that later.

The best way to work around the parameter limit is mappings. It’s not ideal though, as my goal was to only configure new environments by creating a new playbook in Ansible and never having to touch the template code for this. Unfortunately, that is not an option. I now create a mapping per environment, and configure most variables there. The environment to deploy is passed as a parameter, which can then be used to search through the mapping, and values are read using the Fn::FindInMap function. Pretty much as shown below:

So yeah, I was pretty pleased with the result. I was able to rewrite my code and transfer a lot of parameters to mappings. A new environment would now mean creating a new entry in the map. Not that big a deal. And hey, one can have a hundred mappings per template. We will never have that many environments. We are golden! Well… until I started to copy and paste all mapping entries… There I hit the second limit. One can only have a maximum of 63 mapping attributes. OK, that is 33 more than what is stated in the official documentation, but with the variables I wanted and the number of servers, that was not nearly enough.
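
If you want to see how close a (JSON) template is getting to these limits, a quick check with jq is enough (assuming jq is installed; template.json is a placeholder):

    jq '.Parameters | length' template.json            # number of parameters
    jq '.Mappings | length' template.json              # number of mappings
    jq '.Mappings | map_values(length)' template.json  # attributes per mapping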

Now what? Well, back to the Nested Stacks. While they are not an answer to the parameter limit, they are to the mapping attributes boundary. When I create a child template for each type of server with its RDS instance, I don’t need that many mapping attributes per template, and all is well again. You can also pass parameters from parent to child, like this:
And in the child stack you declare the parameter again and pick it up with a Ref:
Granted, at first sight it adds more complexity to the code. On the other hand it makes it more modular, and we are probably now safe from some other limits, like the number of resources per template, the maximum size of your template file or the total amount of swirly brackets you can have in one template. Actually I made that last one up, but for a complete list you can check the documentation at https://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/cloudformation-limits.html.

About AWS and saving money, new EBS disks, backups… and beer

AWS is one of the few companies that actually try to take less money from their valued customers. Take the Trusted Advisor as an example. A web application available to everybody with an AWS account that will tell you where you are spending too much money on AWS resources. Resources you don’t use or are underutilized. It will tell you exactly how much money you can save by downgrading instances, removing idle load balancers or downsizing EBS volumes.

We at Cloudar try to incorporate that same philosophy. We actively seek ways to make our customers pay less money for more or less the same level of service.

One way we could save quite some money for one of our customers recently was by starting to use the newly introduced HDD volumes. They are the answer to all your backup-to-disk needs.

Previously, using magnetic storage (standard) was not always a valid option. The max throughput was considerably lower than the throughput of an ssd volume. Another issue was the maximum volume size, which was 1 TiB. If you needed large amounts of data to be stored on disk, this was not always enough.

So we have a customer who was, for reasons mentioned above, backing up Oracle to disk on ssd. Right when we received the message from AWS that sc1 storage was available, we contacted our customer and asked them whether they wanted to cut their EBS cost for backup to disk by a factor of four.

They did.

Looking at the figures (https://aws.amazon.com/ebs/details/), sc1 is the ideal volume for backups. It is cheap (one fourth of the price of ssd), and has a high throughput. In fact, it has a considerably higher throughput than a standard ssd volume! This at the expense of IOPS, where ssd still is king. But for backups, random IOPS are not important, throughput is.

And not only is it one fourth of the price of an ssd disk, it is even cheaper than S3 standard storage. And it is a lot cheaper than what you pay for S3 storage to store your snapshots. In case you were unaware: storing snapshots on S3 is not at normal S3 cost. It’s more than 3 times the cost of s3 storage. All things you need to keep in mind when finding the best scenario for you.


So that day, our customer’s DBA started to write his backups to the new sc1 volume. The result? No change. It took exactly the same amount of time. What does this tell us? First, that the disk probably was not the bottleneck in this scenario. It’s more likely that Oracle could not deliver the data fast enough to hit any limits on disk level. Second, sc1 is a valid alternative to ssd in certain cases. Third, the customer now pays 1/4 of the price for his backups. It saves him a few thousand dollars per month. He is happy.

I would urge you to try it out for yourself. Just add an sc1 disk (or an st1, depending on the scenario), and do the test. It’s cheap to test, and easy to throw away if it doesn’t suit your needs.

So always be on the lookout for new AWS announcements. One day, you will be able to save some dollars. Dollars you can then spend on other cool AWS features. Or, of course, on beer.


P.S. Prices in this post are for eu-west-1. They can differ per region and are subject to change.

Automating Windows migrations to AWS with Double Take Move and Ansible

Intro

When you’re a cloud reseller/architect, you often get contacted by customers who want to migrate their infra to AWS.

Although I’m not really a fan of the lift-and-shift way of working, sometimes there is no way around it.

 

Instead of spending hours of work on installing and configuring, exporting, importing, etc… we can now really get things going by using Double Take Move and Ansible.

For this article you need some basic knowledge of Ansible.

A good place to start is http://docs.ansible.com/ansible/.

 

Double take move is really well made and very user friendly!

And the license cost of this product is easily forgotten when you don’t have to spend hours exporting, importing and troubleshooting these kinds of moves.

(http://www.visionsolutions.com/products/windows/double-take-move/overview)

 

 

Prep work

Ah yes, there is always some prep work to do (more if you’re not already using Ansible).

AWS

Let’s first configure the AWS environment: create the needed VPC, subnets, VPNs, security groups, roles, etc.

VPC

Make sure that the VPC CIDR block and subnets match your current setup exactly.

(And also create the target servers in the correct subnets)

DHCP option set

Make sure that you create a DHCP option set if the servers you are migrating are on DHCP.

Example:
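
As a sketch of what that could look like from the CLI (domain name, DNS server IPs and resource IDs are placeholders):

    aws ec2 create-dhcp-options --dhcp-configurations \
      'Key=domain-name,Values=corp.example.com' \
      'Key=domain-name-servers,Values=10.0.0.10,10.0.0.11'
    aws ec2 associate-dhcp-options --dhcp-options-id dopt-0123456789abcdef0 --vpc-id vpc-0123456789abcdef0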

 

Public IPs

For connectivity to work, you may need to attach some public IPs to your source and target servers.

We used this mostly in Azure-to-AWS migrations.

 

Ansible Setup

We have been using Ansible for quite some time now and we find that installing it on CentOS or Ubuntu is the best way to go.

Below are the basic steps for CentOS.

  • We mostly use at least Ansible version 2, therefore we need to enable the epel-testing repository to install Ansible. Edit the file under /etc/yum.repos.d/epel-testing.repo to enable it. Then run the below commands.
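
On CentOS this boils down to something like the following (a sketch):

    sudo yum install -y ansible   # with the epel-testing repo enabled as described above
    ssh-keygen -t rsa             # generates the key referred to below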

 

Accept the defaults for the keygen, or change them to your way of working (you can then push this key out to Linux servers, which are not in scope for this blog).

  • Install pywinrm
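
A sketch, assuming pip is already installed:

    sudo pip install pywinrm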

  • If your Windows systems are in a domain (most of them normally are), install the Kerberos dependencies
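
On CentOS these are typically the following packages (taken from the Ansible documentation of that era, so treat the exact list as an assumption):

    sudo yum install -y python-devel krb5-devel krb5-libs krb5-workstation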

  • You will also need the Python part of this
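
A sketch; the extras syntax assumes a reasonably recent pywinrm, older guides install the libraries directly:

    sudo pip install "pywinrm[kerberos]"
    # alternative on older setups:
    # sudo pip install kerberos requests-kerberos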

 

Please read the Kerberos documentation carefully, as you really need this to be correct and working.

 

Kerberos

Edit the /etc/krb5.conf file and change it to reflect your domain

 

 

When that is done, you can test that the connection is working by running the command below.
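
The command in question is presumably kinit (the realm below is a placeholder); it asks for the domain password and prints nothing on success:

    kinit myuser@CORP.EXAMPLE.COM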

 

 

If nothing is returned, don’t panic: that means it worked!

You can then check your Kerberos ticket with the command
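
Presumably klist, which lists your cached tickets:

    klist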

 

 

Inventory

Under your /etc/ansible directory there is a hosts file.

It contains some examples of how to use an Ansible inventory file.

Create yours any way you like.

But for these migrations you can do something like this:

 

 

For each group you create here you can/must create a credential file with the same name.

So in this case a sourceservers.yml and a targetservers.yml.

Store these under the /etc/ansible/group_vars directory.

Content of these files for local users

 

 

Content of the file for domain users

 

 

You can also add it in the hosts file like so

 

 

Windows Configuration

Make sure your target servers are as identical as possible to your source servers.

So same OS, service pack, IP and disk layout, and you’re good to go. (Oh, and don’t rename your target server to the source server’s name just yet; Double Take will complain and will not continue. But a useful name is a good way to identify the target server.)

You will need to do the following on all source and target servers. For the target servers you can maybe create an AMI from which to deploy, depending on how many servers you need to migrate.

  • Configure winrm on all Windows machines that you need to migrate (a script for this can be found here: http://docs.ansible.com/ansible/intro_windows.html)
  • Also make sure you have at least version 3 of PowerShell installed, so basically check all your servers that are below Server 2012.
  • Preferably create an “ansible” user on those systems and allow it to connect through winrm (there is a local group called WinRMRemoteWMIUsers__; add the user to this group, and also to the local Administrators group, or you will not be able to do everything that is needed here)
  • Because of Ansible’s way of spawning a lot of connections, I found that increasing the MaxShellsPerUser parameter for winrm gives fewer problems.

Command :
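
A sketch of that command, to be run in an elevated command prompt on the Windows hosts (the value itself is an assumption, tune it to your environment):

    winrm set winrm/config/winrs @{MaxShellsPerUser="50"}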

Hint: you can combine the above in the ConfigureRemotingForAnsible.ps1 that you download from the Ansible site, by adding the following at the bottom of the script.

I found that in most cases you will need to reboot the server in order for it all to work correctly.

 

Firewalls

Of course we need to modify some firewall rules here and there.

Make sure that Ansible can reach your servers on TCP port 5986.

Also make sure that source and target servers can speak with each other directly over ports 6320 and 6325, TCP and UDP.

The double take console will also need to speak with all servers on these ports.

 

Note: of course, make sure that all other needed rules, routes, VPNs, etc. are in place for your servers.

 

Test test test

We can now test the connection to the Windows servers.

(If you are using the domain credentials make sure you have a valid Kerberos ticket first.)

Run the following

to verify the source server connections

to verify the target server connections
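
A sketch using the win_ping module against the inventory groups defined earlier:

    ansible sourceservers -m win_ping   # source servers
    ansible targetservers -m win_ping   # target servers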

 

Double take console

I’m not going to go into details here, but on the machine where you have installed the Double Take console you add all the servers (source and target), attach the licenses to them and set up full server replication jobs with the parameters of your choice.

Wait before failing over; we will need some more playbooks, depending on your server licensing.

 

Playbooks

On to the interesting stuff. Unless you want to manually install the Double Take software on all servers, in which case go do that now 🙂

Doubletake

I downloaded the Double Take software, unzipped the /setup/dt/x64 folder and placed it in an S3 bucket. If you have 32-bit servers, also extract the 32-bit folder. The examples below only use the 64-bit installer; if the need arises we can also create the 32-bit playbook.

Make sure the files are public, or you will not be able to download them on the source servers; for the target servers, use the read-only S3 policy attached to a role.

Before uploading, also modify the DTsetup.ini file to allow a quiet installation. (Modify it any way you want, but make sure that the diskqueue folder has around 20 GB of free space.)

 

 

When the above is done, we can continue to write our playbooks.

Write the following playbook and place it in the /etc/ansible directory.

 

 

Then create the following directory structure

/etc/ansible/roles/doubletake/tasks/

Then create the following “role” and save it as main.yml.

 

Now we can test the doubletake installation like this

Or if you encrypted the files with vault then
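
A sketch (the playbook file name is hypothetical, use whatever you called your playbook):

    ansible-playbook /etc/ansible/doubletake.yml
    ansible-playbook /etc/ansible/doubletake.yml --ask-vault-pass   # when the group_vars files are encrypted with ansible-vault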

 

If everything is working as it should, Double Take should now be installed everywhere. Nice and fast, no?

 

 

Windows Licensing

 

It all depends on what you want to do, but this example will change the Windows activation to the AWS KMS servers, thus using the AWS licensing instead of your own, or…

Source : “Unable to activate Windows”

Ok that will be a lot of manual work, so let’s not do that.

 

Since Ansible is still a work in progress, I found that the win_unzip module does not work all the time.

Therefore I chose to also put the ec2install.exe in an S3 bucket.

(I wanted to download the latest EC2Config service from Amazon, unzip it and then install it; if win_unzip works better in the future I’ll make an update.)

 

Write the following playbook

 

 

Then create the following directory structure:

/etc/ansible/roles/windowsactivation/tasks/

Then write the following main.yml and place it in the directory above.

 

DNS forwarder

If you have a domain running, you probably also have Windows DNS. Because you are now going to move to AWS, we need to change the forwarder to AWS.

Note: the script below will replace all your forwarders! If you don’t want this, there are also ‘Add-DnsServerForwarder’ and ‘Remove-DnsServerForwarder’.

 

So maybe create a group in the Ansible hosts file, e.g. [activedirectory].

 

To find the IP for the forwarder, take your VPC CIDR block and change the last digit to 2.

Example: for 10.41.0.0/16 the DNS forwarder is at 10.41.0.2.

source : (VPC Subnets –> subnet sizing)

 

Create the below playbook under /etc/ansible

 

 

Failover

 

Right, we have the necessary components now, so let’s do the failover to AWS.

In the Double Take console, start failing over your servers. It’s best to start with the core servers, like AD, then maybe SQL and Exchange.

Then application servers and web servers… it’s really up to you.

 

When everything has failed over, check to see if Ansible is able to reach your servers.

(With Kerberos or local user)

Also, it can get confusing because your target servers are now your source servers! 🙂

 

Anyway, run the setdnsforwarder.yml first to make sure you have internet access.

Then run the windowsactivation.yml

 

Everything should now reboot and come back online, activated against the AWS KMS server.

Since this is a repeatable process, you can first do a test failover, test this out, tune where needed, and then do the actual failover.

 

If you have questions or just don’t want do to this yourself, contact us by email or phone (+32 3 450 80 30).

 

Go Automate Something!

Pauwel

How to use AWS EC2 – GPU Instances

How to use AWS EC2 – GPU Instances on Windows

 

Table of Contents

About GPU Instances

When would you use a GPU instance
G2 family
NVIDIA GRID
Which Remote Desktop solution is recommended?
G2 In Action

 

Let’s get started

Use an existing AMI
Create your own Instance
Connect to the desktop
Use your 3D application or streaming application

Game Setup

 

 

 

About GPU Instances

When would you use a GPU instance

Do you want to build fast 3D applications that run in the cloud and deliver high performance 3D graphics to mobile devices, TV sets, and desktop computers?

If you require high parallel processing capability, you’ll benefit from using GPU instances, which provide access to NVIDIA GPUs with up to 1,536 CUDA cores and 4 GB of video memory. You can use GPU instances to accelerate many scientific, engineering, and rendering applications by leveraging the Compute Unified Device Architecture (CUDA) or OpenCL parallel computing frameworks. You can also use them for graphics applications, including game streaming, 3-D application streaming, and other graphics workloads.

 

G2 family

Features:

  • High Frequency Intel Xeon E5-2670 (Sandy Bridge) Processors
  • High-performance NVIDIA GPUs, each with 1,536 CUDA cores and 4GB of video memory
  • Each GPU features an on-board hardware video encoder designed to support up to eight real-time HD video streams (720p@30fps) or up to four real-time full HD video streams (1080p@30fps)
  • Support for low-latency frame capture and encoding for either the full operating system or select render targets, enabling high-quality interactive streaming experiences

Model        GPUs   vCPU   Mem (GiB)   SSD Storage (GB)
g2.2xlarge   1      8      15          1 x 60
g2.8xlarge   4      32     60          2 x 120

For more information: http://aws.amazon.com/ec2/instance-types/
https://aws.amazon.com/blogs/aws/build-3d-streaming-applications-with-ec2s-new-g2-instance-type/

 

 

NVIDIA GRID

What is NVIDIA GRID

NVIDIA GRID takes key technologies from gaming, workstation graphics, and supercomputing, and enables high performance Applications and Games to run on the Cloud, with pixels streamed over IP networks to remote users. The GRID SDK enables developers to efficiently render, capture and encode on a server, while decoding the stream on a client. NVIDIA incorporates a lot of the functionality within the GPU driver, enabling high performance graphics to be rendered to a remote client with minimal latency.


 

OpenGL and DirectX

You have access to a very wide variety of 3D rendering technologies when you use the g2 instances. Your application does its drawing using OpenGL or DirectX. If you would like to make use of low-level frame grabbing and video encoding, you can make use of the NVIDIA GRID SDK.

 

What is NVIDIA GRID SDK

The GRID SDK enables fast capture and compression of the desktop display or render targets from NVIDIA GRID cloud gaming graphics boards. You will need to register in order to download the GRID SDK. Registration is free, and the download page can be found here:  https://developer.nvidia.com/grid-app-game-streaming.

The GRID SDK consists of two component software APIs: NvFBC and NvIFR.
a) NvFBC captures (and optionally H.264 encodes) the entire visible desktop. In this model you generate one video stream per instance.
b) NvIFR captures (and optionally H.264 encodes) from a specific render target. You can generate multiple streams per instance.

For more information: https://developer.nvidia.com/sites/default/files/akamai/gamedev/grid/grid22/grid-sdk-faq.pdf

 

When should NvFBC and NvIFR be used?

NvFBC is better for remote desktop applications.
NvIFR is the preferred solution to capture the video output of one specific application.

 

 

What is the NVIDIA Video Codec SDK – NVENC

The NVIDIA Video Codec SDK is a complete set of high-performance NVIDIA Video Codecs tools, samples and documentation for hardware encode API (NVENC API) and hardware decode API (NVCUVID API) on Windows and Linux OS platforms. This SDK consists of two hardware API interfaces: NVENC for video encode acceleration and NVCUVID for video decode acceleration with NVIDIA’s Kepler and Maxwell generations of GPUs

 

Download NVIDIA NVENC – Video Codec SDK

The latest NVIDIA Video Codec SDK version available is 6.0, which requires NVIDIA GPU driver R358 or above for Windows and R358 or above for Linux.
You can download the required drivers here: Windows, Linux.

 

How do I enable NvFBC?

Windows: NvFBC needs to be enabled after a clean installation.

{Installation Directory} \bin> NvFBCEnable.exe -enable
Use the NvFBCH264 SDK sample to capture the desktop to a H.264 file.

Linux: For the NvFBC GRID API functions, please refer to the flag

NVFBC_CREATE_CAPTURE_SESSION_PARAMS.bWithCursor

 

What are the optimal encoder settings for desktop applications streaming?

With the preset setting LOW_LATENCY_HP, a single GPU can encode 6-8 streams (the exact number depends on other settings such as RC mode). The LOW_LATENCY_HQ preset will give 4-6 streams (the exact number depends on other settings such as RC mode).

The HQ preset will give you slightly better quality, but the performance will be lower than that of HP. To get the ideal performance for each preset, you can use PerfMSNEC sample application in the SDK.

GRID SDK exposes 3 video encoder presets (LOW_LATENCY_HQ, LOW_LATENCY_HP and LOW_LATENCY_DEFAULT). As the names suggest, these presets are meant for encoding for low-latency applications such as remote streaming, cloud gaming etc. Please refer to the API reference for more details on each parameter.

 

Which Remote Desktop solution is recommended?

For setup, debugging, and configuration of these servers, it is recommended to use TeamViewer. This software allows a remote client to connect to any one desktop window at a time. In comparison, VNC will also capture and stream with the NVIDIA GPU accelerated driver, but for bare-metal systems it will capture all desktop windows. This adds additional overhead and results in reduced performance when streaming. The overhead for capturing one desktop window in TeamViewer is significantly less than capturing all desktop windows with the VNC solution.

Note: TeamViewer uses a proprietary compression format for remote streaming. The NVENC H.264 hardware engine is not used by TeamViewer.

For streaming applications and games (not using H.264 for compression), TeamViewer is the recommended solution for streaming the desktop.
For the best remote streaming experience on GRID, use NVENC with H.264 compression.

How to install Teamviewer

 

Can I use Microsoft Windows Remote Desktop?

While it is more efficient in terms of bitrate and performance of remote graphics in comparison to VNC, Microsoft Remote Desktop uses a proprietary software based graphics driver that does not support all of the NVIDIA GPU accelerated capabilities, and does not enable the NVIDIA GRID driver. Any applications running under Microsoft Remote Desktop will not be using the NVIDIA driver and will not have full benefits of GPU acceleration.

 

 

G2 In Action

Amazon has been working with technology providers to lay the groundwork for the G2 instances. Here’s what they have to offer:

  • Autodesk Inventor, Revit, Maya, and 3ds Max 3D design tools can now be accessed from a web browser. Developers can now access full-fledged 3D design, engineering, and entertainment work without the need for a top-end desktop computer (this is an industry first!).
  • OTOY’s ORBX.js is a pure JavaScript framework that allows you to stream 3D applications to thin clients and to any HTML5 browser without plug-ins, codecs, or client-side software installation.
  • The Agawi True Cloud application streaming platform now takes advantage of the g2 instance type. It can be used to stream graphically rich, interactive applications to mobile devices.
  • The Playcast Media AAA cloud gaming service has been deployed to a fleet of g2 instances and will soon be used to stream video games for consumer-facing media brands.
  • The Calgary Scientific ResolutionMD application for visualization of medical imaging data can now be run on g2 instances.  The PureWeb SDK can be used to build applications that run on g2 instances and render on any mobile device.

Here are some Marketplace products to get you started:

 

 

 

Let’s get started

Use an existing AMI

To simplify the startup process, NVidia has put together AMIs for Windows and Amazon Linux and has made them available in the AWS Marketplace:


Create your own Instance

View your current driver version

When you launch a new G2 instance, the GPU drivers are not installed by default.

Download GLview to view the current version.

Microsoft Windows ships by default with OpenGL v1.1 drivers.
https://www.opengl.org/documentation/implementations/

 

 

Install latest NVIDIA drivers

Go to the NVIDIA website to download the latest drivers:
http://www.nvidia.com/Download/index.aspx

 

Choose option 2:


 

There is no Current Version:


Reboot Windows.

 

Disable the built-in adapter

Open ‘Computer Management’ and disable the ‘Microsoft Basic Display Adapter’.

Reboot the server.

 

Install Features

Media Foundation Package and Quality Windows Audio Video Experience

Media Foundation components in Windows Server 2012 need to be installed.
Start the Server Manager, then click the Add roles and features link in the Welcome Tile. If you closed the Welcome Tile, make it visible again via the View menu or in the left panel navigate to Local Server. In the right panel scroll all the way down to Roles and Features, click the Tasks dropdown and choose Add Roles and Features.

qWave is a QoS software module that runs on qWave-enabled devices for audio/video streaming over networks. Designed for handling audio and video, you can find out more details about this module by clicking here. Click on Start, open a console window and then type the command inside the window.


Reboot the server.

 

Connect through RDP

When you connect through RDP and start GLview, you will see the same driver version, v1.1, as before the installation of the drivers.
Any applications running under Microsoft Remote Desktop will not be using the NVIDIA driver.

Also when you try to start the NVIDIA Control Panel, you receive an error.


 

 

Connect to the desktop

We have to use another Remote Desktop Tool to use the NVIDIA GRID.
There are many Remote connection tools, but for best performance we will install Teamviewer.

Which Remote Desktop solution is recommended?

 

Check the local firewall to allow Teamviewer

Teamviewer will add items by default to your windows firewall, but if you use other software, check the local windows firewall to allow the incoming connection.


 

Check the AWS Inbound Security Group to allow Teamviewer

Check your Security Group for inbound connections, to allow the port your service is running on.

For example, Teamviewer will try to connect on port 80, 443 or 5938.


 

Setup Teamviewer

Login with an RDP session. Download Teamviewer and start the installation.

Be sure to select the unattended option.


You have to create an account if you would like to access your server unattended.


Because we installed Teamviewer with the unattended option, the ‘Start Teamviewer with Windows’ is selected.
To access your server with Teamviewer as a service, assign this server to your account.


Now you see that the computer name and not the ID is visible on your other computer.


 

 

 

Connect to the desktop with Teamviewer


Now you can open the NVIDIA Control Panel from Teamviewer.

And you can see the correct Renderer:

 

Use your 3D application or streaming application

Now your application can make use of the NVIDIA compute capability.

 

 

Game Setup

Enable H.264 video encoding

The GRID cards can offload the H.264 video encoding to the GPU. Sign up for a developer account with NVidia. Download and extract the GRID SDK. In the bin directory run the following: NvFBCEnable.exe -enable -noreset

and reboot.


 

Enable Audio

Start the Windows Audio Service and set the startup type to Automatic.


 

Install virtual soundcard

You can install Razer Surround to get a virtual soundcard.


 

Play your Game

Try a game on the virtual server or stream it to your desktop, laptop, other.

 

Game Benchmark

I have tried the Unigine Heaven Benchmark


You can find other free Benchmarks here:
http://www.ozone3d.net/index_softwares.php


 

 

 

 

Automatic login for Windows Server 2012 R2

Option 1: netplwiz.exe

Use the netplwiz.exe app that comes standard with Windows Server 2012. Search for the app through the Start menu, or navigate to C:\Windows\System32 to find it.
The password is not stored in clear text in the registry when you do it this way.


 

Option 2: sysinternals autologon

https://technet.microsoft.com/en-us/sysinternals/autologon


 

Option 3: Registry or GPO

Enable AutoLogon

Key: HKLM\SOFTWARE\Microsoft\Windows NT\CurrentVersion\Winlogon
Value: AutoAdminLogon (REG_SZ)
Data: 1 (Enabled)

Default Domain Name

Key: HKLM\SOFTWARE\Microsoft\Windows NT\CurrentVersion\Winlogon
Value: DefaultDomainName (REG_SZ)
Data: DOMAINNAME

Default User Name

Key: HKLM\SOFTWARE\Microsoft\Windows NT\CurrentVersion\Winlogon
Value: DefaultUserName (REG_SZ)
Data: USERNAME

Default Password

Key: HKLM\SOFTWARE\Microsoft\Windows NT\CurrentVersion\Winlogon
Value: DefaultPassword (REG_SZ)
Data: PASSWORD

Warning: in options 2 and 3, be sure to also block the regedit tool on this computer, as anyone logged on to the computer will be able to see the account password stored in the registry in clear text.