Learning Power BI and Free Stuff


I decided to use some of my time off over the holiday break to catch up on training (check out MS Ignite on-demand).  In particular, I’ve been working on a Power BI dashboard for a customer, and while I get the basics, I wanted to start learning some more advanced skills.  I took some Power BI courses through Lynda (check out my playlist) but wanted more depth.

I typically like to read books to learn new things, especially ones with a lot of hands-on examples you can follow.  Then I stumbled on Reza Rad’s website.  He is an MVP who focuses on Power BI and related technologies.  What blew me away was that Reza offers his book (Power BI: From Rookie to Rock Star) for free.  The book is a compilation of various blog entries plus additional content (over 1,000 pages, and kept up to date).  It’s broken into topics that you can follow along with fairly easily.  I did have to install SQL Express and figure out a few things with a newer data set, but I was impressed by how much actually worked as-is.  Occasionally I do wish he would go deeper, and it seems like a few steps were skipped, but figuring out how to do things is what really makes you learn it.

The fact that he gives it away is equally interesting.  Why free? He writes:

I never write book for money, I write because I like to get a wider audience in front of me, and tell them about the great product, and best practices of doing things with that and so on. With famous publishers I would definitely get more audience. However when the content be available for free, and online then everyone would read it, search engines would direct audience to this content, and audience range will expand.

If you read his book, you will have no doubts about his experience and subject matter expertise. I do think this marketing strategy is a good way to get your audience hooked and build credibility quickly – and it may lead to spending more for his videos or live training. So, thanks Reza for sharing so much quality info with the Power BI community; this really helps those of us getting started!

The Journey

My daughter is currently applying to colleges and the experience has been interesting for both of us.  While doing recon on admissions processes, I came across the blog post “Position vs. Disposition” by Rick Clark.  While the message has been told many times in many ways – it’s the journey, not the destination – I thought his version, told through his own experience, was powerful.  His point is that being rejected by a college should be celebrated (certainly after the sting has gone), because the work you put into pushing yourself toward a goal is ultimately a reward in itself.

“..while you may not have been given a position in said college, you have earned something no admission letter will ever give you—a disposition formed through growth, maturity, and commitment.”

-Rick Clark

This is a life message that holds true in all situations, in work and in personal matters. However sweet the reward – a promotion to a role you’ve wanted for a lifetime, say – remember that it’s what you did to get there that is the true benefit, and it stays with you forever.

Exchange Archives

When it comes to Exchange, one of the confusing things for customers is the Exchange Archive feature – especially for those coming from an existing 3rd party archiving solution.  When I work with customers who are upgrading to a newer on-premises version, or to Exchange Online, and who have an archiving system in place, the first thing I ask is: what is the current solution used for? Archives are used either for compliance reasons (e.g. retention, records, litigation, legal requirements) or to extend mailbox capacity (e.g. provide large mailboxes by using lower cost storage). Occasionally, the archive may serve both functions.

When planning for the new end-state design, the question is what to do.  Most customers assume they should just deploy Exchange Online archiving.  This post will give some reasons to reconsider that decision.  [Spoiler] Exchange’s online archive feature has nothing to do with compliance.

Exchange Archive: The Origin Story


The archive feature (the name has changed many times over the years) was first introduced in Exchange 2010.  One of the goals in Exchange 2010 was to provide support for large mailboxes (which in retrospect were not all that large compared to Office 365 today!)  The main problem was that Outlook 2010’s cached mode would cache the (whole) mailbox, so rather than rely on a change to the Outlook client, Exchange added the archive feature – an extension to your mailbox that would not be cached.  If you deployed an archive, you could enjoy a very large mailbox and not need to cache all your mail. For on-premises deployments, you could even put the archive on separate storage, or even separate servers. This was great since really large mailboxes take a very long time on the initial download or when you have to recreate your profile (which for many customers is a standard troubleshooting step).  Also, many laptops were very limited on drive space.

What about compliance features and the online archive?  The online archive actually did not bring any new compliance features with it.  All the compliance features apply to the mailbox – the whole mailbox – not just the primary mailbox.  Any retention or legal hold applied to the person applies to both the primary and the archive, or to just the primary mailbox if an archive is not used.  In other words, having an archive added no additional compliance capabilities.  This was true in Exchange 2010, and is still true today.
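
A quick way to see this for yourself (a sketch, assuming Exchange Online PowerShell and a hypothetical mailbox jdoe): hold and retention settings are properties of the mailbox itself, and they read exactly the same whether or not an archive is enabled.

    # Hold/retention settings are mailbox-level properties; ArchiveStatus is
    # reported separately and has no bearing on them.
    Get-Mailbox -Identity jdoe |
        Format-List LitigationHoldEnabled, InPlaceHolds, RetentionPolicy, ArchiveStatus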

Why Deploy an Online Archive?

If we don’t get additional features, then why deploy an online archive?

  1. You exceed the capacity of your primary mailbox storage (at the time of writing, Office 365 E3 includes a 100 GB primary mailbox)
  2. You have Outlook 2010 (or older) clients and want to have large mailboxes. Given Outlook 2010 is out of support, customers should be doing everything possible to upgrade.
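
If one of those conditions does apply, enabling the archive is straightforward. A minimal sketch, assuming Exchange Online PowerShell and a hypothetical mailbox jdoe:

    # Enable the online archive for a single mailbox.
    Enable-Mailbox -Identity jdoe -Archive

    # Confirm the archive was provisioned and check its quota.
    Get-Mailbox -Identity jdoe | Format-List ArchiveStatus, ArchiveName, ArchiveQuota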

If you have deployed an archive product for addressing mailbox capacity issues, then I strongly recommend that you do not deploy the online archive by default. Why not?

  • Not all mail clients can access the online archive
  • Mobile clients cannot search the online archive
  • It’s more complex and can be confusing to people

In this scenario, just use a large primary mailbox, as Outlook 2013 and newer have the option to set how much content (based on age) is cached.  This cache setting effectively works just like having an archive, since content outside your cache window is only available while online.

[Screenshot: Outlook cached mode sync slider]
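
If you want to standardize that cached-content window rather than leave the slider to each person, it is backed by a policy registry value. This is only a sketch assuming Outlook 2016 or later (the 16.0 hive) and the standard Cached Exchange Mode sync setting; verify the value name against the current Office GPO templates before pushing it out.

    # Set the Outlook cached mode sync window to 12 months via the policy key.
    # Assumes the 16.0 (Outlook 2016 and later) registry hive; adjust for your version.
    $key = 'HKCU:\Software\Policies\Microsoft\Office\16.0\Outlook\Cached Mode'
    New-Item -Path $key -Force | Out-Null
    New-ItemProperty -Path $key -Name SyncWindowSetting -PropertyType DWord -Value 12 -Force | Out-Null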

If you deployed an archive product to meet compliance or records management needs, consider using the native Exchange features such as hold, retention, MRM, and labels.  Keeping all email within Exchange versus an external archive product lets you easily perform content and eDiscovery searches.  Also, it’s much easier to manage your data lifecycle with the mail on one solution.  I’ll reiterate – these compliance and records features work in Exchange regardless of whether you deploy the Exchange online archive or not.  In other words, you could retire your external archive, use only a primary mailbox, and enable retention policies to continue providing an immutable copy of the person’s mailbox data.
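
For example (a minimal sketch, assuming Exchange Online PowerShell, a hypothetical mailbox jdoe, and a hypothetical MRM policy named "7 Year Retention" that already exists), the native features apply directly to the mailbox with no archive in the picture:

    # Keep an immutable copy of the mailbox content, including deleted and edited items.
    Set-Mailbox -Identity jdoe -LitigationHoldEnabled $true

    # Apply an MRM retention policy to manage the data lifecycle.
    Set-Mailbox -Identity jdoe -RetentionPolicy "7 Year Retention"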

A very common scenario as customers move to Office 365 is to ingest all of their 3rd party archive data and PSTs (local / personal archives) into Office 365.  Given this could be a lot of data, exceeding the 100 GB limit, customers migrate this data directly into the online archive.  Exchange Online does offer an unlimited, auto-expanding archive.  Note that for migrations, the archive expansion takes time – so you cannot just import everything at once.  Once the content is in Exchange, retention policies can be applied to all of it to start controlling your enterprise data and limiting risk exposure.
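
If you do go this route, note that the auto-expanding archive must be turned on before the extra space is provisioned; it can be enabled tenant-wide or per mailbox. A sketch, assuming Exchange Online PowerShell and a hypothetical mailbox jdoe:

    # Turn on auto-expanding archives for the whole tenant...
    Set-OrganizationConfig -AutoExpandingArchive

    # ...or enable the archive plus auto-expansion for a single mailbox.
    Enable-Mailbox -Identity jdoe -Archive
    Enable-Mailbox -Identity jdoe -AutoExpandingArchive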

As long as the archive on the source system corresponds to a mailbox, this type of migration is straightforward.  If your archive solution is for journaled mail, the archive typically is not associated with specific mailboxes.  This is much harder to ingest into Exchange, and a better strategy may be to sunset the journal solution (let it age out) and, moving forward, implement retention and the other compliance features mentioned above.  A nice benefit of retention over journaling is that journaling only captures email that is sent and received.  There are scenarios where people share folders to trade messages, which never actually go through transport!

Hopefully this sheds some light on how Exchange online archives work, when to use them, and the benefits and drawbacks if you do plan to use them.

High Volume Mailbox Moves


One of the challenges of planning your migration to Office 365 is figuring out how fast you can go.  Office 365 migration performance and best practices covers some great information, but I’ll add to it here based on my experience with real-world projects.

Spoilers Ahead

One of my most recent engagements is wrapping up and I have done some analysis on the summary statistics. Note this was a move from a legacy dedicated version of Office 365 – so the throughput can be a bit higher than coming from an on-premises Exchange deployment.  On average (throwing out the high and low values) we moved about 3,000 mailboxes per week. One of the most impressive things about this migration was that it also included a deployment of Office Pro Plus.  There were only a couple of months for planning – and deploying to over 30,000 workstations with very little impact to the helpdesk was a great surprise.

On another project, we have just started pilot migrations from on-premises Exchange 2010 servers.  Initially, we saw pretty limited performance when routing traffic through the typical network infrastructure (e.g. a hardware load balancer).  When we changed the configuration, we more than doubled our throughput, and continued tuning until our last test was over 50 GB/hr (our initial test was closer to 4 GB/hr).  Not too bad!

Migration Architecture

How did we get this speed boost?  In a typical architecture, mail access (in this case the /EWS endpoint) goes over HTTPS (443) from the client to the hardware load balancer (HLB).  You may have a reverse proxy in front of the HLB, and you may have an additional interior firewall.  Some customers do not allow external access to the EWS virtual directory, but this is required as part of establishing hybrid connectivity with Office 365.

[Diagram: MRS traffic following the standard client data path through the reverse proxy and HLB]

You may just reuse the same endpoint for the MRS traffic.  In this case your mailbox migrations will follow the same data path as the rest of your traffic.  A few additional constraints must be met: a publicly signed certificate, and you cannot bridge the SSL traffic (break encryption and re-encrypt).  If you meet this bar, the design will meet the minimal requirements for MRS – however, it may not perform very well given how many layers of infrastructure it traverses, plus it may impact the total available bandwidth of those devices.  Creating 1:1 MRS endpoints is a way to bypass all of this infrastructure and ramp up throughput.

[Diagram: dedicated 1:1 MRS endpoints, one DNS name per server, bypassing the shared infrastructure]

In this example, three new DNS names are created, each resolving to a specific server. The firewall must allow traffic only from Exchange Online to the on-premises servers (see Office 365 URLs and IP address ranges).  The certificate with the additional MRS names will have to be redeployed to all the infrastructure (e.g. HLB) and Exchange servers (unless using a wildcard certificate – e.g. *.rosenlabs.com).  Now when you create migration requests you can spread them across the endpoints.  For most customers, the ACL on the firewall is enough security to allow this configuration – at least for the duration of the mailbox migrations.
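
To make that concrete, here is a sketch of wiring the new names into migration endpoints and then spreading batches across them. It assumes Exchange Online PowerShell, the hypothetical host names mrs1/mrs2/mrs3.rosenlabs.com from the diagram, hypothetical on-premises migration credentials, a hypothetical target delivery domain, and a CSV of users with an EmailAddress column:

    # Create one remote-move endpoint per dedicated MRS host name.
    $cred = Get-Credential rosenlabs\migadmin
    1..3 | ForEach-Object {
        New-MigrationEndpoint -ExchangeRemoteMove -Name "MRS$_" `
            -RemoteServer "mrs$_.rosenlabs.com" -Credentials $cred
    }

    # Spread migration batches across the endpoints.
    New-MigrationBatch -Name "Wave1-A" -SourceEndpoint "MRS1" `
        -TargetDeliveryDomain rosenlabs.mail.onmicrosoft.com `
        -CSVData ([System.IO.File]::ReadAllBytes("C:\migration\wave1a.csv")) -AutoStart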

Other Considerations

There is always a bottleneck in the system; the question is whether you hit it before you reach the velocity you’re aiming for.  I work with customers to walk through every point in the data flow and see where the bottleneck will be.  In the original architecture above, the first bottleneck is nearly always the HLB – either because of its network connection or the load it’s already under.  After that, the source Exchange servers tend to be unable to keep up and cause migration stalls. Also be aware of things like running backups that could severely impact resources. Finally, other items like day-after helpdesk support capacity or network downloads (the OAB, or maybe changing your offline cache value) may also prove to limit your velocity.  MRS migrations usually have a very low failure rate, but other ancillary things coupled with the migration – mobile clients, etc. – need to be considered.
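
When you are hunting for that bottleneck, the move requests themselves will tell you where things are stalling (the StatusDetail values call out stalls due to source, target, and so on). A sketch, assuming Exchange Online PowerShell and moves already in progress:

    # Quick view of where in-progress moves are spending their time.
    Get-MoveRequest -MoveStatus InProgress |
        Get-MoveRequestStatistics |
        Sort-Object BytesTransferred -Descending |
        Format-Table DisplayName, StatusDetail, PercentComplete, BytesTransferred -AutoSize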