Ignite 2017 @ Orlando Day 4

Day 4:

Azure High Performance Networking

This was a very interesting session with lots of good info. It started off with VNet integration for Azure Container Service and the ability to give an IP to a single container instead of sharing one IP between several containers.

VNet Service Endpoints are also new. They give you the ability to deny internet access to VMs while still allowing specific Azure services as endpoints. So your VMs can talk to Azure PaaS services without you having to figure out which IPs those endpoints sit behind, and without opening traffic to the rest of the internet.
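
As a minimal sketch of enabling one (all resource names here are hypothetical), you tag a subnet with the service endpoint using the AzureRM PowerShell module:

    # Hypothetical names; add the Microsoft.Sql service endpoint to the Backend subnet
    Get-AzureRmVirtualNetwork -ResourceGroupName "MyRG" -Name "MyVNet" |
        Set-AzureRmVirtualNetworkSubnetConfig -Name "Backend" `
            -AddressPrefix "10.0.1.0/24" -ServiceEndpoint "Microsoft.Sql" |
        Set-AzureRmVirtualNetwork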

NSGs also got a bit less dumb than they were: service tags were applied to NSGs. What this means is that you can, for example, set a tag for SQL servers or IIS servers and have all your IIS or SQL servers covered by the policy. You set up one rule with a SQL tag and all your SQL servers are bound to that NSG rule, instead of creating several rules based on the source IPs of each SQL server.
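
As a hedged sketch of what one such rule looks like (the resource names and port are made up for illustration), using the built-in Sql tag with the AzureRM module:

    # Hypothetical names; one outbound rule with the "Sql" tag instead of per-server IP rules
    $rule = New-AzureRmNetworkSecurityRuleConfig -Name "Allow-Sql-Outbound" `
        -Access Allow -Protocol Tcp -Direction Outbound -Priority 100 `
        -SourceAddressPrefix "VirtualNetwork" -SourcePortRange "*" `
        -DestinationAddressPrefix "Sql" -DestinationPortRange "1433"

    New-AzureRmNetworkSecurityGroup -ResourceGroupName "MyRG" -Location "westeurope" `
        -Name "Sql-NSG" -SecurityRules $rule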

Read more

Ignite 2017 @ Orlando Day 1

Today was the first day of Ignite 2017, which kicked off with a keynote from Satya Nadella. Unfortunately it was a lot of the same slides and info as Inspire 2017, so it was a bit of a waste of time. And since we had a lot of drinks at some very nice places in Orlando and a sprinkler fight with some InSpark colleagues the night before, a couple more hours of sleep would have been nice 😉.

Empower IT and Developer Productivity with Azure
After the keynote I started with the session from Scott Guthrie. It was packed with info, but besides the session from Corey covering massive VM sizes with 128 cores and multiple terabytes of memory, a couple of things were interesting to me:

  • Update Management:
    Update Management is in preview now, and as I noticed in my own subscription it's not yet available for all machines; I don't know the prerequisites for that yet. But you can enable Update Management to scan VMs for the updates they need, on Windows and Linux. You can also include on-prem machines. It's then displayed in a nice dashboard.
  • Change Tracking:
    With Azure Change Tracking in the OMS suite you can track changes in a VM through Log Analytics for a big number of resources, for example at the file, registry, process, and service level. Here too, a slick dashboard gives a good overview of what happened (see the query sketch after this list).
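
As a hedged sketch of pulling that Change Tracking data yourself (the resource group and workspace names are hypothetical, and this assumes the workspace still uses the legacy OMS search syntax):

    # Hypothetical names; list recent file-level changes recorded by Change Tracking
    Get-AzureRmOperationalInsightsSearchResults -ResourceGroupName "MyRG" `
        -WorkspaceName "MyWorkspace" -Top 50 `
        -Query "Type=ConfigurationChange ConfigChangeType=Files"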

After a horrible lunch experience the real sessions started. Here is a quick overview with some valuable takeaways for myself within my focus areas.

Virtual Machine Diagnostics on Microsoft Azure
This was a short 20-minute session in the OCCC South Hall Expo Theater #10. A new PowerShell script has been released to get the health of a VM and output it as a JSON-formatted overview. With Get-AzureRMVmHealth.ps1 you can get a quick overview of several details: is my NIC up, what's the IP, what port is used for RDP, is the admin account disabled, what's the username, are all vital services for remote access running, and lots more! Give it a try with the following command.
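
Purely as a hypothetical sketch of an invocation (the parameter names here are my assumption; check the script's built-in help for the real ones):

    # Hypothetical parameter names; the script outputs a JSON-formatted health overview
    Login-AzureRmAccount
    .\Get-AzureRMVmHealth.ps1 -ResourceGroupName "MyRG" -VmName "MyVM"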

Read more

DPM 2016 Modern Backup Storage ReFS Volume offline and in RAW state

A couple of weeks ago a customer reported problems with a DPM 2016 server using Modern Backup Storage (MBS). Modern Backup Storage is DPM's new approach to a more efficient and flexible storage pool without the known limitations of the LDM database. With MBS you use Storage Spaces on the DPM server to add the disks to a large pool. On top of that storage space you then create one or more volumes with shares that the DPM server can use as storage, instead of adding unallocated disks that DPM manages.
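
As a rough sketch of that layering (the pool, disk, and label names are hypothetical), the Storage Spaces side looks something like this:

    # Hypothetical names; pool the poolable disks and carve out a ReFS volume for DPM
    $disks = Get-PhysicalDisk -CanPool $true
    New-StoragePool -FriendlyName "DPMPool" `
        -StorageSubSystemFriendlyName "Windows Storage*" -PhysicalDisks $disks

    # Create a virtual disk on the pool and format it with ReFS, as MBS requires
    New-VirtualDisk -StoragePoolFriendlyName "DPMPool" -FriendlyName "DPMDisk" `
        -ResiliencySettingName Simple -UseMaximumSize |
        Get-Disk | Initialize-Disk -PartitionStyle GPT -PassThru |
        New-Partition -UseMaximumSize -AssignDriveLetter |
        Format-Volume -FileSystem ReFS -NewFileSystemLabel "DPMStorage"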

A full explanation of MBS is outside the scope of this blog, but the problem we had with it is related to Storage Spaces, and more specifically to ReFS volumes. ReFS is the new file system Microsoft has been cooking on for the last 6+ years. It has many improvements, several of which exist to make sure data corruption does not occur, and to repair it automatically or take preventive action when it does.

So why does the volume on this DPM server appear offline, inaccessible, and corrupt with a RAW file format?

In this setup

Read more

DPM 2016 Modern Backup Storage and Deduplication performance issue

Modern Backup Storage and Deduplication

One of the great features of Hyper-V in combination with a virtual DPM server was the ability to use deduplication. On the Hyper-V host you have a dedup-enabled volume where you place the .vhdx files that serve as backup storage for the virtual DPM server. This way you can save a lot of disk space.

Since DPM 2016 you can use Modern Backup Storage (MBS). With MBS you use Storage Spaces to create a volume with shares that you present to DPM. Now I hear you thinking: hey, we can enable dedup on that volume, let DPM back up data to it, and it gets deduped, so we can use a physical machine to… Unfortunately not. Because MBS requires ReFS as the file system on the disk, dedup is ruled out: it is not yet supported or available on ReFS volumes.

So we still need a physical Hyper-V host with a dedup-enabled NTFS volume for the .vhdx files. In the VM we create the storage space with a virtual disk and a ReFS volume, and you are ready to cruise with your DPM server.
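
Enabling dedup on that host volume is a one-liner (the drive letter is hypothetical); on Server 2016 the Backup usage type is meant for exactly this kind of virtualized backup workload:

    # Hypothetical drive letter; dedup the NTFS volume holding the DPM .vhdx files
    Enable-DedupVolume -Volume "D:" -UsageType Backup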

The setup

The situation above already describes the setup a bit. In short, we have a physical host that provides storage for the virtual DPM server. In this case it was a disk enclosure attached to the host with several SATA disks that we added to a storage space. On top of the storage space a mirrored volume was created to place the backup .vhdx files on. The host takes care of the deduplication of the backup data inside the .vhdx files; the VM takes care of the Storage Space and DPM.

The physical host has to take care of some dedup jobs like optimization, garbage collection, and scrubbing. These jobs cost a big amount of IO and a large amount of
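
For reference, those are the job types you can also kick off by hand (drive letter hypothetical):

    # Hypothetical drive letter; the three dedup job types mentioned above
    Start-DedupJob -Volume "D:" -Type Optimization
    Start-DedupJob -Volume "D:" -Type GarbageCollection
    Start-DedupJob -Volume "D:" -Type Scrubbing

    # Watch progress and the resulting savings
    Get-DedupJob
    Get-DedupStatus -Volume "D:"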

Read more

Storage Spaces Direct Mirroring vs MRV (Parity) performance

HP LeftHand

Back several years ago (about 6 or 7, if I remember correctly), when Storage Spaces and Storage Spaces Direct (S2D) did not exist yet, there was another vendor called "LeftHand" which did kind of the same trick. LeftHand was bought by HP and the product was renamed to P4000 and later on HP StoreVirtual 4000. The principle of this type of storage is different from other vendors: the storage nodes use a local RAID level across several disks in a system, and additional nodes with the exact same hardware are pooled into a cluster. On top of the cluster, volumes with network RAID levels are created. So disks in a system (1 disk in the case of a single local RAID 5 pool of disks) could die without losing the node. In this setup you have several layers of redundancy: within a storage node, but also across the entire cluster.

You could start with 2 systems and create a mirrored volume. The raw capacity of 1 node minus the RAID level was the total usable capacity. So take a two-node setup with 12x 1TB disks per node: after local RAID you have give or take 21TB across the cluster (12 disks in a RAID 5 = 12TB minus 1TB for parity, minus some lost bits and bytes, so give or take 10.5TB per node). In a two-node setup you will have 10.5TB of usable space with mirrored volumes, because the data is mirrored across both nodes. Mirroring data like that gives you high availability at the node level: you could lose a storage node without the volume going offline, but it costs you half of the raw storage capacity. If you add an extra node for a total of 3 nodes, you would have 10.5 * 3 = 31.5TB of raw capacity; taking the mirroring into consideration you will have about 15.7TB of usable capacity. And this keeps going: in a 10-node setup you have 105TB of raw capacity and about 50+TB usable. So with a network mirror volume you always lose half of the capacity. If you choose a 3-way or even a 4-way mirror (don't know why, but it is possible) you have massive redundancy and performance but terrible efficiency, because in a 4-way mirror on 4 nodes only 25% of the raw capacity is available for data.

Yes… it seems like a waste of space, so LeftHand (and later on HP) came up with Network RAID 5. When you have three or more nodes you could set up a volume with Network RAID 5: the data is placed on 2 nodes and the 3rd node does the parity. You could still lose a disk in a node or an entire node… BUT it was dreadfully slow, and HP recommended AGAINST setting up Network RAID 5…

So I hear you thinking: what's with all the "old stuff" about LeftHand… Well, Storage Spaces Direct works on kind of the same principle, and the same applies to its parity volumes… But bear with me on this 🙂

Mirror vs MRV Volumes

With Storage Spaces the resiliency level is set at the volume level. That means you can create mirrored and parity volumes, and also a new flavor named Mixed Resiliency Volumes (MRVs) with both parity and mirrored space. With traditional hardware RAID, mirrored disks are always faster than parity disks because of the parity process, and the same applies to S2D. With mirrored volumes all data is mirrored across a number of nodes and disks in the cluster. By default Storage Spaces Direct uses a 3-way mirror layout: all blocks written to disk are copied to 2 other nodes (in the case of a 3-node or larger cluster). Because of this you lose 2/3 of your raw capacity by default.
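
As a minimal sketch (the pool wildcard and volume name are hypothetical), creating a plain 3-way mirrored volume on an S2D cluster looks roughly like this:

    # Hypothetical names; a 1 TB 3-way mirrored CSV volume on the S2D pool
    New-Volume -StoragePoolFriendlyName "S2D*" -FriendlyName "Mirror01" `
        -FileSystem CSVFS_ReFS -ResiliencySettingName Mirror -Size 1TB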

Microsoft S2D Program Manager Cosmos Darwin created a nice website to calculate how much usable space you get with different combinations of disks, capacity, and resiliency settings; check it out at http://aka.ms/s2dcalc

When you create a 1 TB 3-way mirrored volume, that volume has a 3 TB footprint. That's a simple calculation: the 1 TB of data is copied 2 additional times in a 3-way mirror, which makes 3 TB. When you create a 1 TB MRV with, for example, 30% mirrored capacity and 70% parity capacity, we have to do a bit more math. The mirrored 300 GB * 3 is 900 GB. Then we have 700 GB of parity space, which requires double the space, so 1400 GB. The total footprint of the 1 TB MRV disk is 900 GB + 1400 GB = 2300 GB. So compared to a 3-way mirror, an MRV disk saves you 700 GB of space on a 1 TB volume.
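
To create an MRV like the one in this example you define both tiers at creation time. A hedged sketch, assuming the default "Performance" (mirror) and "Capacity" (parity) tier names that Enable-ClusterS2D sets up:

    # Hypothetical volume name; 300 GB mirror tier + 700 GB parity tier = 1 TB MRV
    New-Volume -StoragePoolFriendlyName "S2D*" -FriendlyName "MRV01" `
        -FileSystem CSVFS_ReFS `
        -StorageTierFriendlyNames Performance, Capacity `
        -StorageTierSizes 300GB, 700GB

The footprint of this volume on the pool should come out around the 2300 GB calculated above.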

Because of the massive loss of capacity with a 3-way mirror, people (most of them the people responsible for the budget) are forcing or highly recommending to use or at least consider parity or a form of mixed resiliency to get more GBs/TBs out of their hardware… But at what cost?

Read more

VMM 2016 and Network Controller certificate Issues

Since near the end of last year I have been blessed with some hardware to test a lot of new features of Windows Server 2016, System Center 2016, and Azure Stack. Last week I experienced an issue with my Network Controller VMs. In the end it turned out to be more of a VMM issue, I think. But I wanted to share this with the world in case somebody else runs into it and googles in vain, because there is nothing to find about this issue.

Problem

I did the Network Controller and SLB MUX setup several weeks ago and all was running fine, until all of a sudden I couldn't change anything in VMM anymore. Almost every action I took triggered this error:

Error (21426)
Execution of :: on the configuration provider  failed. Detailed exception: Unable to connect to the network service. Check connection string and network connectivity. Execution of Microsoft.SystemCenter.NetworkService::OpenDeviceConnectionEx on the configuration provider 3e2875a7-5831-4fb2-b388-1672e1c20fee failed. Detailed exception: System.Net.Http.HttpRequestException: An error occurred while sending the request. ---> System.Net.WebException: The underlying connection was closed: Could not establish trust relationship for the SSL/TLS secure channel. ---> System.Security.Authentication.AuthenticationException: The remote certificate is invalid according to the validation procedure.
Check the documentation for the configuration provider or contact the publisher support.
Unable to connect to the network service. Check connection string and network connectivity.

Recommended Action
Check the documentation for the configuration provider or contact the publisher support.

Troubleshooting

So I did a bunch of tests and troubleshooting.
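
Given the error, one sanity check worth doing early is to look at the certificate the Network Controller REST endpoint actually presents. A minimal sketch (the REST endpoint name is hypothetical):

    # Hypothetical REST endpoint; dump the certificate the NC presents over TLS
    $ncRest = "nc.contoso.local"
    $tcp = New-Object System.Net.Sockets.TcpClient($ncRest, 443)
    $ssl = New-Object System.Net.Security.SslStream($tcp.GetStream(), $false, { $true })
    $ssl.AuthenticateAsClient($ncRest)
    $cert = [System.Security.Cryptography.X509Certificates.X509Certificate2]$ssl.RemoteCertificate
    $cert | Format-List Subject, Issuer, NotAfter, Thumbprint
    $ssl.Dispose(); $tcp.Close()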

Read more