Datacenter – Backup Services SLA

Viewing 7 reply threads
  • Author
    Posts
    • #338
      aswancer
      Participant

      Datacenter (core) services have a very broad scope with several major components, including physical and logical protection, server management, database management, and backups and restores. IT has four primary SLAs and one standard that are meant to address these components: 1. LoboCloud service, 2. Database management service, 3. Backup services, 4. Co-Location service, plus the datacenter standard. IT offers additional related services that can be found in the IT service catalog. This SLA is specific to backup services.

    • #409
      elisha
      Participant

      1. It might be a good idea to specify in the General Overview what kinds of machines may take advantage of this.

      It says in 2.1 that it is “Available for data stored on servers connected on the campus network;” Does that mean any fixed computer in any location on the campus network, or do the servers have to live in the data center?

      Can the number of backups/duration of retention be extended?

      How far away from the data center is the offsite backup? Is it far enough away to safely assume data continuity in the face of a major regional disaster?

      How is the integrity of backups monitored/measured? Is that a UNM IT function or an End User one?

      Are on-demand snapshots possible?

      2.1.1 – “Maintain and ensure devices have up-to-date virus/malware and protection and operating system (critical) updates installed within one week of vendor distribution;” This is not always possible for major systems.

      2.1.2 – “Customers must purchase additional storage prior to exceeding capacity;” is there monitoring that notifies end users when limits are being approached?

      2.2.2 – same question as I have on other SLAs regarding selection of 99.9% uptime for this service, how it is measured, etc. Are new backups triggered automatically when they fail due to backup service downtime?

      • #425
        bpietrewicz
        Keymaster

        Elisha,

        Thanks for taking the time to comment.  

        Q: It might be a good idea to specify in the General Overview what kinds of machines may take advantage of this.

        A: Section 2.1 bullet 2 specifies Windows, Unix and Linux operating systems are supported.

        Q: It says in 2.1 that it is “Available for data stored on servers connected on the campus network;” Does that mean any fixed computer in any location on the campus network, or do the servers have to live in the data center?

        A: The backup client will work for any system connected to a wired port on the campus network.  The system does not need to be in the datacenter.  Performance will vary based on available bandwidth. 

        Q: Can the number of backups/duration of retention be extended?

        A: It can, but only by exception and only in extraordinary circumstances.  Our current backup system is limited in its capabilities.  We have purchased a new backup system and we are in the process of implementing it.  The new system is far more flexible.  Look for enhancements in the near future.

        Q: How far away from the data center is the offsite backup? Is it far enough away to safely assume data continuity in the face of a major regional disaster?

        A: The offsite backups are currently stored at the Pit.  We are in the process of conducting a business impact analysis (BIA).  One of the outcomes of the BIA is to drive requirements for a new disaster recovery plan.  I suspect that DR will eventually (within the next year or so) be done in the cloud.

        Q: How is the integrity of backups monitored/measured? Is that a UNM IT function or an End User one?

        A: Integrity checking of backups is done by reviewing the client’s backup logs on the server being backed up.  This is the customer’s responsibility.  Additional monitoring options will be available in the future. 

        Q: Are on-demand snapshots possible?

        A: Yes, but not with the backup service.  On-demand snapshots are available to customers with systems hosted on our storage.  There are caveats, and in some cases additional fees apply.

        Q: 2.1.1 – “Maintain and ensure devices have up-to-date virus/malware and protection and operating system (critical) updates installed within one week of vendor distribution;” This is not always possible for major systems.

        A: Exceptions can be made in extraordinary circumstances with reasonable justification.  

        Q: 2.1.2 – “Customers must purchase additional storage prior to exceeding capacity;” is there monitoring that notifies end users when limits are being approached?

        A: Reports are available but the customer must check the reports.  Look for improvements on this in the future. 

        Q: 2.2.2 – same question as I have on other SLAs regarding selection of 99.9% uptime for this service, how it is measured, etc. Are new backups triggered automatically when they fail due to backup service downtime?

        A: Uptime is measured by our monitoring system.  Backups occur nightly.  In the event of an outage during the backup window, backups will restart where they left off the night before.
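
        To put the 99.9% figure in concrete terms, here is a minimal sketch of the arithmetic. The thread does not say over which period uptime is measured, so the 30-day and 365-day periods below are assumptions, and the function name is illustrative only.

```python
# Downtime allowed by an availability target over a given period.
# The 99.9% target comes from the SLA discussion; the measurement
# periods (30 and 365 days) are assumptions, not stated in the thread.

def allowed_downtime_minutes(availability_pct: float, period_days: float) -> float:
    """Return the maximum downtime, in minutes, that still meets the target."""
    total_minutes = period_days * 24 * 60
    return total_minutes * (1 - availability_pct / 100)


monthly = allowed_downtime_minutes(99.9, 30)
yearly = allowed_downtime_minutes(99.9, 365)

print(f"30 days:  {monthly:.1f} minutes")     # 30 days:  43.2 minutes
print(f"365 days: {yearly / 60:.2f} hours")   # 365 days: 8.76 hours
```

        Under these assumptions, a 99.9% target leaves roughly 43 minutes of downtime per 30 days, which is short compared with a nightly backup window.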

        Regards,

        Brian 

    • #410
      cdean
      Participant

      Although at this point Law has no plans to use this service, we reserve the right to create a customized SLA specific to our needs with mutually-agreed upon consequences for both Law and UNM IT.
      Cyndi Johnson

      • #426
        bpietrewicz
        Keymaster

        Cyndi,
        This is the SLA for the backup service that IT offers today.  We are open to discussing customized agreements that are mutually agreed upon.  

    • #420
      ayoder
      Participant

      2 Backup fees are not listed in service catalog entry. Data storage pricing is listed, is there an additional charge for the “Backup Service” license fee? etc.

      http://it.unm.edu/servicecatalog/asset_list.php?type=2&a_id=128&dept=247&origin=az

      There are also no add-ons listed in the service catalog entry.

      2.1 Should this section be under 3.1?

      2.1.1 “Notify security@unm.edu of any compromises or breaches” Why is this not a Help.UNM ticket?

      2.1.2 “Backup client is not capable of backing up databases. Native database tools are required” What backup product doesn’t support this? Should UNM IT be evaluating new backup solutions?

      3.1 “Basic up/down system monitoring” does this include storage monitoring?

      4.1 Where is the maintenance window listed for the backup service?

      5.2 “Requests will be fulfilled within fifteen (15) days” What happens in the event of an emergency for a department? 15 days seems pretty generous for responding to a service request for this service. Exception process seems overly complex with too many approvals, sounds like by the time all the approvals and sign offs were obtained we would be at 15 days for a normal service request.

      • #454
        bpietrewicz
        Keymaster

        Andrew,

        Thanks for taking the time to comment.

        Q: 2 Backup fees are not listed in service catalog entry. Data storage pricing is listed, is there an additional charge for the “Backup Service” license fee? etc.

        A: There is no fee for licensing.  We only charge for the amount of storage purchased/used. 

        Q: 2.1 Should this section be under 3.1?

        A: No, 2.1 is the section for features; 3.1 is for UNM IT responsibilities.  Perhaps I don’t understand the question?

        Q: 2.1.1 “Notify security@unm.edu of any compromises or breaches” Why is this not a Help.UNM ticket?

        A: The statement is accurate.  For more information please see: http://it.unm.edu/security/

        Q: 2.1.2 “Backup client is not capable of backing up databases. Native database tools are required” What backup product doesn’t support this? Should UNM IT be evaluating new backup solutions?

        A: No database products are currently supported.  We have purchased a new backup system and we are in the process of implementing it.  The new system supports several databases.  Look for new features soon.  

        Q: 3.1 “Basic up/down system monitoring” does this include storage monitoring?

        A: Yes, it includes up/down of all components of the backup system except the clients and client servers.  Monitoring clients and the servers being backed up is the customer’s responsibility.  Storage utilization is not monitored in an automated fashion.  Reports are available for storage utilization.  It is the customer’s responsibility to review the reports. 

        Q: 4.1 Where is the maintenance window listed for the backup service?

        A: There is no limit to when backups can be run.  It is strongly recommended to run backups between 5pm and 5am.  I will update the SLA.  

        Q: 5.2 “Requests will be fulfilled within fifteen (15) days” What happens in the event of an emergency for a department? 15 days seems pretty generous for responding to a service request for this service. Exception process seems overly complex with too many approvals, sounds like by the time all the approvals and sign offs were obtained we would be at 15 days for a normal service request.

        A: 15 days is the time to initially deliver the service.  This includes provisioning the storage, training, and handling any nuances that might arise.  If it is an emergency, it can be discussed when the ticket is acknowledged, which is within 12 hours.

        Regards,

        Brian 

    • #449
      jwong
      Participant

      Some of my comments are similar to Elisha’s.  Although Brian has clarified some of them already, I am going to list them anyway.

      2.1 What versions of Windows OS and Linux distributions will the backup client support? 

      2.1.2 What do you mean by revision?  Is this version control?  Is it possible to retain more than the past 3 versions?  Is revision a file or block level backup?  Can the backup retention period extend beyond 180 days? Do you also provide backup service for bare metal recovery?

      Is there some type of reporting tool or high water mark alerts to let the customer know the allotted backup space is close to the quota limit?    

      Is there a cost sheet on backup space? Is the cost based on GB of space, based per client base, or client plus space?  

      How far is the replication site from the backup site? 

      Although at this point Financial Services Division (FSD) has no plans to use this service at this time, we reserve the right to create a customized SLA specific to our needs with mutually-agreed upon consequences for both FSD and UNM IT.

      • #458
        bpietrewicz
        Keymaster

         
        Jenny:

        Thanks for taking the time to comment.

        Q: 2.1 What versions of Windows OS and Linux distributions will the backup client support?

        A: The backup system supports the clients listed at the following link:
        http://www-01.ibm.com/support/docview.wss?uid=swg21243309
         
        Q: 2.1.2 What do you mean by revision?  Is this version control?  Is it possible to retain more than the past 3 versions?  Is revision a file or block level backup?  Can the backup retention period extend beyond 180 days? Do you also provide backup service for bare metal recovery?

        A: Revision means change in this case.  Every time the file changes it is considered a revision.  

        A: We can only change the retention policies (3 versions, 180 days) in extraordinary circumstances.  Managing multiple retention policies is extremely resource intensive.  We are in the process of implementing a new backup system so look for the feature set to improve in the near future.   

        A: Backups are file level backups.

        A: We do not currently offer bare metal recovery.  It may be available in the future for an additional charge.

        Q: Is there some type of reporting tool or high water mark alerts to let the customer know the allotted backup space is close to the quota limit?

        A: Yes, storage utilization reports are available.
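
        Since the reports have to be checked by the customer, a high-water-mark check like the one asked about could be sketched as follows. This is not a feature of the backup service: the function name, the 80% warning threshold, and the idea of reading used and allotted figures from a report are all assumptions for illustration.

```python
# Hypothetical high-water-mark check a customer could run against the
# figures in a storage utilization report. The 80% default threshold
# is an assumption; the thread does not define any alert level.

def quota_status(used_gb: float, allotted_gb: float, warn_at: float = 0.8) -> str:
    """Classify usage as 'ok', 'warning', or 'exceeded'."""
    if allotted_gb <= 0:
        raise ValueError("allotted_gb must be positive")
    ratio = used_gb / allotted_gb
    if ratio > 1.0:
        return "exceeded"
    if ratio >= warn_at:
        return "warning"
    return "ok"


print(quota_status(500, 1000))    # ok
print(quota_status(850, 1000))    # warning
print(quota_status(1100, 1000))   # exceeded
```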

        Q: Is there a cost sheet on backup space? Is the cost based on GB of space, based per client base, or client plus space?  

        A: Good question, and the answer is a bit complicated.  If the system being backed up is not owned by IT, the charge is the normal cost of storage, $700 per TB. http://it.unm.edu/servicecatalog/asset_list.php?service=23&product=128&origin=servicelist

        A: If the server being backed up is owned/managed by IT, it is half the cost of the drive that is being backed up or $350 per TB.  Ex: A LoboCloud system with a 1TB drive costs $350 to back up.  I will update the service description.   
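
        The two rates above can be worked through with a short sketch. Only the $700 and $350 per-TB rates come from this thread; the function name and the linear proration of fractional terabytes are assumptions, and the billing period is not stated here.

```python
# Rates quoted in the thread; everything else is illustrative.
RATE_PER_TB_NOT_IT_OWNED = 700  # USD per TB, system not owned by IT
RATE_PER_TB_IT_OWNED = 350      # USD per TB, system owned/managed by IT


def backup_cost(size_tb: float, it_owned: bool) -> float:
    """Estimate the backup charge, assuming fractional TB are prorated linearly."""
    rate = RATE_PER_TB_IT_OWNED if it_owned else RATE_PER_TB_NOT_IT_OWNED
    return size_tb * rate


print(backup_cost(1, it_owned=True))      # 350  (the LoboCloud example above)
print(backup_cost(2.5, it_owned=False))   # 1750.0
```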

        Q: How far is the replication site from the backup site? 

        A: Data is replicated to a building on South Campus.  Replication to a cloud provider is on the roadmap. 

        Q: Although at this point Financial Services Division (FSD) has no plans to use this service at this time, we reserve the right to create a customized SLA specific to our needs with mutually-agreed upon consequences for both FSD and UNM IT.

        A: This is the SLA for the backup service that IT offers today.  We are open to discussing customized agreements that are mutually agreed upon.  

        Regards,

        Brian

    • #477
      barchu02
      Participant

      2 Service Description
      The backup server keeps 3 revisions of active files and 1 revision of files that have been deleted from the client. Deleted files are retrievable for 180 days.

      What is the definition of an active file? If I leave a file untouched for an extended period of time is it inactive?

      2.1.2 Boundaries of Service Features and Functions
      Customer may not exceed previously agreed upon storage capacity when using the backup and restore service; Customers must purchase additional storage prior to exceeding capacity;

      What happens after storage capacity is exceeded? Is there a warning? Are zero-byte files kept in place of real files?

      2.2.2 Specific Service Levels
      Additional backup storage available within three (3) business days pending availability;
      3 business days seems like a long period of time to wait.

      • #488
        bpietrewicz
        Keymaster

        Q: 2 Service Description
        The backup server keeps 3 revisions of active files and 1 revision of files that have been deleted from the client. Deleted files are retrievable for 180 days.
        What is the definition of an active file? If I leave a file untouched for an extended period of time is it inactive?

        A: Active means the file has not been deleted / is available. 
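
        Read together with the service description quoted above, the retention rule can be sketched as below. This is an interpretation, not the backup system's actual logic: the constants come from the SLA text, while the function name, the data model (a list of revision labels, oldest first), and the treatment of day 180 as still retrievable are assumptions.

```python
from datetime import date, timedelta
from typing import List, Optional

# Values from the SLA text quoted in this thread.
ACTIVE_REVISIONS = 3          # revisions kept while the file exists on the client
DELETED_REVISIONS = 1         # revisions kept once the file is deleted
DELETED_RETENTION_DAYS = 180  # how long a deleted file stays retrievable


def revisions_kept(revisions: List[str],
                   deleted_on: Optional[date],
                   today: date) -> List[str]:
    """Return the revisions still retrievable; `revisions` is oldest first."""
    if deleted_on is None:
        # Active file: age does not matter, only the revision count.
        return revisions[-ACTIVE_REVISIONS:]
    if today - deleted_on > timedelta(days=DELETED_RETENTION_DAYS):
        # Deleted file past the retention period: nothing left.
        return []
    return revisions[-DELETED_REVISIONS:]


revs = ["v1", "v2", "v3", "v4"]
print(revisions_kept(revs, None, date(2024, 1, 1)))               # ['v2', 'v3', 'v4']
print(revisions_kept(revs, date(2024, 1, 1), date(2024, 3, 1)))   # ['v4']
print(revisions_kept(revs, date(2024, 1, 1), date(2024, 12, 1)))  # []
```

        In this reading, a file left untouched for a long time is still active, so its stored revisions are not aged out.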

        Q: 2.1.2 Boundaries of Service Features and Functions
        Customer may not exceed previously agreed upon storage capacity when using the backup and restore service; Customers must purchase additional storage prior to exceeding capacity;
        What happens after storage capacity is exceeded? Is there a warning? Are zero-byte files kept in place of real files?

        A: IT will notify the customer that the allotted storage has been exceeded.  The customer will either need to clean up the storage or purchase additional storage.  The service will continue to function normally.

        Q: 2.2.2 Specific Service Levels
        Additional backup storage available within three (3) business days pending availability;
        3 business days seems like a long period of time to wait.

        A: 3 days is our standard turnaround time for most service requests.  Requests are often completed faster, but 3 days is the time that we can commit to.

    • #500
      susier
      Participant

      CARC Systems comment: 2.1.1 seems to indicate that the end user (UNM unit?) is responsible for monitoring log files for failures: This seems like it should be a part of the service provided. It should not be the user’s responsibility to monitor this and report errors in backups— it should be the other way around.

    • #502
      susier
      Participant

      CARC Systems comment: 2.1.2 states that “customers may not exceed previously agreed upon storage capacity when using the backup and restore service; Customers must purchase additional storage prior to exceeding capacity”. This seems to be a bit backwards: users may not realize that they are exceeding capacity, for example, until after they have finished a storage task. Why not adopt a “clean up or pay up” policy to give the user the option to expand their capacity within a certain timeframe of a notice?

  • The topic ‘Datacenter – Backup Services SLA’ is closed to new replies.