Automate Preservation/Retention for OneDrive for Business sites using Office 365 Compliance Center PowerShell

I worked on a small project a few months back where we had to automate retention/preservation for OneDrive for Business sites in Office 365. Here were the high-level requirements for the automation:

  1. An attribute of the user object in the on-premises Active Directory determines whether the user’s OneDrive site should be placed on an indefinite hold. The attribute used in this case was “extensionAttribute13”.
  2. The “extensionAttribute13” attribute for each user in the on-premises Active Directory is either empty/null or has the value “DONOTPURGE”. A value of “DONOTPURGE” indicates that the user’s OneDrive for Business site has to be put on legal hold. An empty/null value means that the site should NOT be on hold, and that the hold should be released if the site was previously on hold.
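
The two requirements above amount to reconciling a desired state (users flagged in AD) against the current state (sites already on hold). Here is a minimal sketch of that comparison, assuming both sides are already available as plain lists of site URLs. The sample URLs and variable names are hypothetical and are not taken from the script.

```powershell
# Hypothetical sample data. In the real script the first list is derived from
# the AD query and the second from the existing retention policies.
$sitesThatShouldBeHeld = @(
    'https://therazas-my.sharepoint.com/personal/alice_therazas_onmicrosoft_com',
    'https://therazas-my.sharepoint.com/personal/bob_therazas_onmicrosoft_com'
)
$sitesCurrentlyHeld = @(
    'https://therazas-my.sharepoint.com/personal/bob_therazas_onmicrosoft_com',
    'https://therazas-my.sharepoint.com/personal/carol_therazas_onmicrosoft_com'
)

# Flagged in AD but not yet on hold: these need to be added to a policy.
$toAdd = @($sitesThatShouldBeHeld | Where-Object { $sitesCurrentlyHeld -notcontains $_ })

# On hold but no longer flagged in AD: these need their hold released.
$toRelease = @($sitesCurrentlyHeld | Where-Object { $sitesThatShouldBeHeld -notcontains $_ })

"To add:     " + ($toAdd -join ', ')
"To release: " + ($toRelease -join ', ')
```

With this sample data, alice's site ends up in `$toAdd`, carol's site in `$toRelease`, and bob's site is left untouched because it is already on hold.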

I decided to use Office 365 Compliance Center PowerShell to implement the automation. However, there were two major challenges in this implementation:

  1. We had to use a “specific-location” retention policy instead of an org-wide policy. As documented here, a location-specific policy can only contain 100 SharePoint/OneDrive sites. The number of users on retention hold could be several hundred, so a single policy wouldn’t be enough to hold all the sites. The script had to be intelligent enough to dynamically create new policies when no empty “slots” were found in the existing policies. I say “slots” because a policy that was previously full (which triggered the creation of new policies) can later have some of its users released from hold, leaving empty “slots” that should be reused.
  2. We had to dynamically determine the user’s OneDrive for Business site URL in Office 365 based on the user object in Active Directory. One approach we could have used is documented here. However, that approach wouldn’t work if the script ran after the user’s profile had been marked for deletion because they had left the company, while the OneDrive site itself still existed (there is a default 30-day retention period) and an indefinite retention hold had to be placed on it. See this article to learn more about the default preservation/retention for OneDrive for Business sites. To address this challenge, I decided to derive the OneDrive site URL from the user’s UPN, which is also stored in the on-premises Active Directory. This is not the best approach, since the URL is inferred from a naming convention rather than looked up, which makes it not very reliable, but it was the best one available to us.
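
To make the UPN-based derivation from the second challenge concrete, here is a small sketch of the same character replacements the script performs ("@", "." and "-" become "_"). The function name `Get-OdbSiteUrlFromUpn` and the sample UPN are hypothetical, and the result is only the expected URL, which still has to be validated against the tenant.

```powershell
# Hypothetical helper (not part of the original script): builds the expected
# OneDrive for Business URL from a UPN by replacing "@", "." and "-" with "_".
function Get-OdbSiteUrlFromUpn {
    param(
        [Parameter(Mandatory = $true)][string]$Upn,
        [Parameter(Mandatory = $true)][string]$SiteRoot
    )
    # Inside a character class "." is literal and "\-" is a literal hyphen.
    $personalSegment = $Upn -replace '[@.\-]', '_'
    return $SiteRoot.TrimEnd('/') + '/' + $personalSegment
}

$url = Get-OdbSiteUrlFromUpn -Upn 'jane.doe-smith@therazas.onmicrosoft.com' `
                             -SiteRoot 'https://therazas-my.sharepoint.com/personal'
$url
```

For the sample UPN this yields `https://therazas-my.sharepoint.com/personal/jane_doe_smith_therazas_onmicrosoft_com`. Because the value is computed rather than looked up, the script checks it with Get-SPOSite before using it.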

The first part of the script is to initialize all the configuration variables which would need to be adjusted for every environment. The script also assumes that the initial retention policy (“test” in this example) has already been created. Here is the initialization section of the script:

 ############################## Begin Configuration #####################################################
#The logFileLocation is a folder where logs will be created and it must already exist
$logFileLocation = "C:\RetentionLogs" 
#The Url of the SPO Service for this tenant
$spoTenant = "https://therazas-admin.sharepoint.com"
#OneDrive site root
$odbSiteRoot = "https://therazas-my.sharepoint.com/personal"
#The UPN of the tenant admin
$tenantAdmin = "admin@therazas.onmicrosoft.com"
#The encrypted password will be stored here. If we don't find the file, we will simply prompt for password and save it here
$passwordFile = [System.Environment]::GetFolderPath('ApplicationData') + "\AdminPassword.txt"
#The Compliance Center connection Uri. This is global and won't need to be changed in most cases
$complainceCenterUri = "https://ps.compliance.protection.outlook.com/powershell-liveid/"
#Name of the preservation policy
$policyName = "test"
#Maximum sites per policy
$MaxSitesPerPolicy = 100
#The AD search root for users
$searchRoot = "LDAP://dc=tmr,dc=com"
#Attribute Name that we will query AD for
$attributeName = "extensionAttribute13"
#Value of attributeName that will put the ODB Site on hold
$litigationHoldValue = "DONOTPURGE"
############################## End Configuration #####################################################

Here are some of the functions in the script:

  1. Write-LogFile: A function that simply logs messages in a log file for troubleshooting and understanding of what the script is doing.
  2. Retrieve-Credentials: Saves/retrieves the password for the $tenantAdmin account to/from an encrypted file on the local computer. The file is stored at the $passwordFile path.
  3. Ensure-Policies: This function is responsible for creating additional retention policies if needed to ensure that one policy does not contain more than 100 sites. The new policies are named after the initial policy name defined in the $policyName variable, with a number appended (1, 2, 3… and so on).
  4. Get-SitesInPolicy: Returns all sites that are added to one of the preservation policies. We have to retrieve this information to determine which users should have their retention hold released if the attribute has been cleared.
  5. Add-SiteToPolicy: Finds an open “slot” in the policies that exist and adds the site to the first available slot.
  6. Remove-SiteFromPolicy: Removes the specified site from the policy. There could be multiple policies to check so we loop through each policy until we find the site.
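
The naming scheme and the “slot” search described above can be illustrated without connecting to the Compliance Center. The sketch below is an assumption-laden illustration, not the script's implementation: the function names `Get-RequiredPolicyNames` and `Find-PolicyWithFreeSlot` are hypothetical, and the per-policy site counts are passed in as a plain hashtable instead of being read from the policies.

```powershell
# Hypothetical helper: lists the policy names needed to hold $TotalSites sites,
# i.e. the base name followed by base name + 1, base name + 2, and so on.
function Get-RequiredPolicyNames {
    param([string]$BaseName, [int]$TotalSites, [int]$MaxSitesPerPolicy = 100)
    # Round up so that 101 sites need 2 policies; always keep the base policy.
    $policyCount = [Math]::Max(1, [int][Math]::Ceiling($TotalSites / [double]$MaxSitesPerPolicy))
    $names = @($BaseName)
    for ($i = 1; $i -lt $policyCount; $i++) { $names += ($BaseName + $i) }
    return $names
}

# Hypothetical helper: returns the first policy that still has a free slot,
# or $null when every listed policy is full or missing from the counts.
function Find-PolicyWithFreeSlot {
    param([hashtable]$SiteCountByPolicy, [string[]]$PolicyNames, [int]$MaxSitesPerPolicy = 100)
    foreach ($name in $PolicyNames) {
        if ($SiteCountByPolicy.ContainsKey($name) -and $SiteCountByPolicy[$name] -lt $MaxSitesPerPolicy) {
            return $name
        }
    }
    return $null
}

# 250 sites at 100 per policy need three policies.
$names = @(Get-RequiredPolicyNames -BaseName 'test' -TotalSites 250)
$names -join ', '

# "test" is full, "test1" had three holds released earlier, so it is reused.
$counts = @{ 'test' = 100; 'test1' = 97; 'test2' = 100 }
$slot = Find-PolicyWithFreeSlot -SiteCountByPolicy $counts -PolicyNames $names
$slot
```

Walking the names in order is what makes previously freed slots get reused before a newer policy is filled.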

The complete script is provided below. Don’t forget to update the configuration section and install the latest version of the SharePoint Online Management Shell before running the script.

 Import-Module Microsoft.Online.SharePoint.PowerShell -DisableNameChecking

############################## Begin Configuration #####################################################
#The logFileLocation is a folder where logs will be created and it must already exist
$logFileLocation = "C:\RetentionLogs" 
#The Url of the SPO Service for this tenant
$spoTenant = "https://therazas-admin.sharepoint.com"
#OneDrive site root
$odbSiteRoot = "https://therazas-my.sharepoint.com/personal"
#The UPN of the tenant admin
$tenantAdmin = "admin@therazas.onmicrosoft.com"
#The encrypted password will be stored here. If we don't find the file, we will simply prompt for password and save it here
$passwordFile = [System.Environment]::GetFolderPath('ApplicationData') + "\AdminPassword.txt"
#The Compliance Center connection Uri. This is global and won't need to be changed in most cases
$complainceCenterUri = "https://ps.compliance.protection.outlook.com/powershell-liveid/"
#Name of the preservation policy
$policyName = "test"
#Maximum sites per policy
$MaxSitesPerPolicy = 100
#The AD search root for users
$searchRoot = "LDAP://dc=tmr,dc=com"
#Attribute Name that we will query AD for
$attributeName = "extensionAttribute13"
#Value of attributeName that will put the ODB Site on hold
$litigationHoldValue = "DONOTPURGE"
############################## End Configuration #####################################################
$Session = $null
#Logs the desired message to the log file location
Function Write-LogFile ([String]$Message)
{
    $Message = "[" + [DateTime]::Now.ToString() + "] " + $Message
    if (Test-Path $logFileLocation)
    {
        #One log file per day, named after the short date with "/" replaced by "_"
        $fileName = $logFileLocation + "\" + [DateTime]::Now.ToShortDateString().Replace("/", "_") + ".txt"
        $Message | Out-File $fileName -Append -Force
    }
    else
    {
        #The log folder is missing, so fall back to writing the message to the output
        $errorM = "Log File Location " + $logFileLocation + " not found..."
        Write-Output $errorM
        Write-Output $Message
    }
}

#Returns the credentials for $tenantAdmin. The password is read from $passwordFile, or prompted for and saved there if the file does not exist
Function Retrieve-Credentials ([Boolean]$SPO=$false)
{
    if (Test-Path $passwordFile)
    {
        $securePassword = Get-Content -Path $passwordFile | ConvertTo-SecureString
    }
    else
    {
        $securePassword = Read-Host -Prompt "Enter password" -AsSecureString
        $securePassword | ConvertFrom-SecureString | Out-File -FilePath $passwordFile -Force
    }
    if ($SPO -eq $true)
    {
        $creds = New-Object Microsoft.SharePoint.Client.SharePointOnlineCredentials($tenantAdmin, $securePassword)
    }
    else
    {
        $creds = New-Object System.Management.Automation.PSCredential($tenantAdmin, $securePassword)
    }
    return $creds
}

#This function accepts $TotalSites as a parameter which is the total number of ODB sites in the environment that need a preservation hold
Function Ensure-Policies ([Int]$TotalSites)
{
    $currentPolicyName = $policyName
    #Number of additional policies needed beyond the "base" policy. Round up explicitly:
    #an [Int] cast of the division result rounds to nearest (150/100 becomes 2), which could create one policy too many
    [Int]$reqNumber = [Math]::Ceiling([double]$TotalSites / $MaxSitesPerPolicy) - 1
    #Let's make sure the "base" policy exists to avoid unintentional policy creation. The lookup will throw an exception if the policy does not exist
    $policy = $null
    try
    {
        $policy = Get-RetentionCompliancePolicy -Identity $policyName
    }
    catch
    {
        $policy = $null
    }
    if ($null -eq $policy)
    {
        throw [System.InvalidOperationException] "Compliance policy $policyName does not exist. Please check the name and try again."
    }
    #We need to ensure that we have $reqNumber of policies created with the naming format $policyName1, $policyName2 and so on...
    for ($i = 1; $i -le $reqNumber; $i++)
    {
        $currentPolicyName = $policyName + $i
        $policy = $null
        try
        {
            $policy = Get-RetentionCompliancePolicy -Identity $currentPolicyName
        }
        catch
        {
            $policy = $null
        }
        if ($null -eq $policy)
        {
            #This means the policy does not exist. We need to create it
            $message = "Creating preservation policy $currentPolicyName with indefinite hold..."
            Write-LogFile $message
            $policy = New-RetentionCompliancePolicy -Name $currentPolicyName
            $rule = New-RetentionComplianceRule -Name "Indefinite Hold for $currentPolicyName" -Policy $currentPolicyName -PreservationDuration Unlimited
        }
    }
}

#Returns an ArrayList of all sites in the preservation policies
Function Get-SitesInPolicy
{
    $sitesList = New-Object System.Collections.ArrayList
    $currentPolicyName = $policyName
    $i = 0
    #Walk $policyName, $policyName1, $policyName2... until a policy name is not found
    while ($true)
    {
        $policy = $null
        try
        {
            $policy = Get-RetentionCompliancePolicy -Identity $currentPolicyName
        }
        catch
        {
            $policy = $null
        }
        if ($null -eq $policy)
        {
            break
        }
        foreach ($location in $policy.SharePointLocation)
        {
            $x = $sitesList.Add($location.Name)
        }
        $i = $i + 1
        $currentPolicyName = $policyName + $i
    }
    #The leading comma keeps the ArrayList from being unrolled into individual items on return
    return ,$sitesList
}

#Adds the specified site to an available "slot" in the policies.
#If every existing policy is full, the lookup of the next policy name fails and the caller logs the error
Function Add-SiteToPolicy ([String]$SiteUrl)
{
    $currentPolicyName = $policyName
    $i = 0
    while ($true)
    {
        $policy = Get-RetentionCompliancePolicy -Identity $currentPolicyName
        if ($policy.SharePointLocation.Count -lt $MaxSitesPerPolicy)
        {
            $message = "Adding site " + $SiteUrl + " to the preservation policy " + $currentPolicyName
            Write-LogFile $message
            Set-RetentionCompliancePolicy -Identity $currentPolicyName -AddSharePointLocation $SiteUrl
            break
        }
        $i = $i + 1
        $currentPolicyName = $policyName + $i
    }
}

#Removes the specified site from the list of policies
Function Remove-SiteFromPolicy ([String]$SiteUrl)
{
    $currentPolicyName = $policyName
    $i = 0
    while ($true)
    {
        $policy = Get-RetentionCompliancePolicy -Identity $currentPolicyName
        foreach ($location in $policy.SharePointLocation)
        {
            if ($location.Name.Equals($SiteUrl))
            {
                $message = "Removing site " + $SiteUrl + " from the policy " + $currentPolicyName
                Write-LogFile $message
                Set-RetentionCompliancePolicy -Identity $currentPolicyName -RemoveSharePointLocation $SiteUrl
                return
            }
        }
        $i = $i + 1
        $currentPolicyName = $policyName + $i
    }
}
#Query the AD
try
{
    $old_ErrorActionPreference = $ErrorActionPreference
    $ErrorActionPreference = 'Stop'
    Write-LogFile "******************************Starting Script Execution******************************"
    $message = "Querying Active Directory for accounts on legal hold at path " + $searchRoot
    Write-LogFile $message
    $objOU = New-Object System.DirectoryServices.DirectoryEntry($searchRoot)
    #Only return users whose $attributeName equals $litigationHoldValue
    $strFilter = "(&(objectCategory=User)(" + $attributeName + "=" + $litigationHoldValue + "))"
    $objSearcher = New-Object System.DirectoryServices.DirectorySearcher
    $objSearcher.SearchRoot = $objOU
    $objSearcher.PageSize = 1000
    $objSearcher.Filter = $strFilter
    $objSearcher.SearchScope = "Subtree"
    $colProplist = "userPrincipalName"
    foreach ($i in $colPropList) { $x = $objSearcher.PropertiesToLoad.Add($i) }
    $colResults = $objSearcher.FindAll()
    $message = "Found " + $colResults.Count + " result(s) from the AD search query"
    Write-LogFile $message
    $message = "Connecting to SPO service at url " + $spoTenant + " for validating ODB Sites"
    Write-LogFile $message
    $creds = Retrieve-Credentials
    Connect-SPOService -Url $spoTenant -Credential $creds
    $userTable = New-Object System.Collections.Hashtable($colResults.Count)
    foreach ($objResult in $colResults)
    {
        #Updated 4/15 to handle the null UPN scenario
        #Check the property collection before indexing into it, since it can be null or empty when the attribute is not set
        $upnValues = $objResult.Properties["userPrincipalName"]
        if ($null -eq $upnValues -or $upnValues.Count -eq 0 -or $null -eq $upnValues[0])
        {
            #User does not have UPN Set. Continue
            Write-LogFile "Found user without UPN set. Skipping..."
            continue
        }
        $upn = $upnValues[0].ToString()
        #Calculate the ODB Site Url
        $odbSite = $upn.Replace("@", "_")
        $odbSite = $odbSite.Replace(".", "_")
        $odbSite = $odbSite.Replace("-", "_")
        $odbSite = $odbSiteRoot + "/" + $odbSite
        $message = "Checking if ODB site for user " + $upn + " exists at " + $odbSite
        Write-LogFile $message
        try
        {
            $site = Get-SPOSite $odbSite
        }
        catch
        {
            $site = $null
        }
        if ($null -eq $site)
        {
            Write-LogFile "Site Not Found, Skipping this user..."
            continue
        }
        Write-LogFile "Site Found!"
        $userTable.Add($upn, $odbSite)
    }
    #$userTable now contains all users with valid ODB Sites.
    #Let's make sure we have the required number of policies created (1 Policy = 100 sites)
    #Connect to the compliance center
    #Note: this remote session relies on Basic authentication. Where that is not available, Connect-IPPSSession from the ExchangeOnlineManagement module is the usual alternative
    $Session = New-PSSession -ConfigurationName Microsoft.Exchange -ConnectionUri $complainceCenterUri -Credential $creds -Authentication Basic -AllowRedirection -WarningAction Ignore
    $x = Import-PSSession $Session
    Ensure-Policies -TotalSites $userTable.Count
    #Next, we need to retrieve all ODB Sites that are already in the preservation policies
    $sitesList = Get-SitesInPolicy
    #Now let's determine which sites we need to add to a policy
    $allUsers = $userTable.Keys.GetEnumerator()
    $addCount = 0
    while ($allUsers.MoveNext())
    {
        $currentUser = $allUsers.Current
        $currentSite = $userTable[$currentUser]
        #Check if this site is already in the policy
        if ($sitesList.Contains($currentSite))
        {
            $message = "ODB Site for user " + $currentUser + " is already in the preservation policy. Skipping"
            Write-LogFile $message
            #Remove this site from the list so we can release the hold on the remaining sites
            $sitesList.Remove($currentSite)
        }
        else
        {
            try
            {
                Add-SiteToPolicy -SiteUrl $currentSite
                #Set-RetentionCompliancePolicy -Identity $policyName -AddSharePointLocation $currentSite
                $addCount = $addCount + 1
            }
            catch
            {
                Write-LogFile "Error occurred while adding site to policy."
                Write-LogFile $_.Exception.Message
            }
        }
    }
    $message = "Successfully added " + $addCount + " site(s) to the preservation policy"
    Write-LogFile $message
    #Now let's remove the sites from the hold. Everything left in $sitesList needs to be removed. The user either does not exist in AD or does not have $litigationHoldValue set.
    $removeCount = 0
    foreach ($siteUrl in $sitesList)
    {
        try
        {
            #Set-RetentionCompliancePolicy -Identity $policyName -RemoveSharePointLocation $siteUrl
            Remove-SiteFromPolicy -SiteUrl $siteUrl
            $removeCount = $removeCount + 1
        }
        catch
        {
            Write-LogFile "Error occurred while removing site from policy."
            Write-LogFile $_.Exception.Message
        }
    }
    $message = "Successfully removed " + $removeCount + " site(s) from the preservation policy"
    Write-LogFile $message
    Write-LogFile "******************************Ending Script Execution******************************"
}
catch
{
    Write-LogFile "Error occurred."
    Write-LogFile $_.Exception.Message
    #Updated 4/15 to handle the null UPN scenario
    Write-LogFile $_.ScriptStackTrace
}
finally
{
    #Clear the compliance center session
    if ($null -ne $Session)
    {
        Remove-PSSession $Session
    }
    $ErrorActionPreference = $old_ErrorActionPreference
}

Happy SharePointing!